Overview

Dataset statistics

Number of variables: 75
Number of observations: 2733
Missing cells: 17571
Missing cells (%): 8.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 MiB
Average record size in memory: 600.0 B
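
The missing-cell percentage follows directly from the counts in this block. A minimal sketch of the arithmetic, using only the figures reported above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 75
n_observations = 2733
missing_cells = 17571

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

The result rounds to the 8.6% shown in the report.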

Variable types

Numeric: 7
Categorical: 57
DateTime: 3
Boolean: 8

Alerts

%European(CEU)_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_race_paciente_specimens and 52 other fields High correlation
%European(CEU)_df_race_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 52 other fields High correlation
%Native and LatinAmerican (NA)_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 59 other fields High correlation
%Native and LatinAmerican (NA)_df_race_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 59 other fields High correlation
%West African(YRI)_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 47 other fields High correlation
%West African(YRI)_df_race_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 47 other fields High correlation
Able to ViablyPassage in nude mice_specimen_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 41 other fields High correlation
AdditionalMedicalHistory_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 56 other fields High correlation
Age atDiagnosis_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 52 other fields High correlation
Age atSampling_specimen_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 51 other fields High correlation
Best Response_trathistory_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 33 other fields High correlation
BiologicalSex_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 46 other fields High correlation
BiopsySite_specimen_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 56 other fields High correlation
Chr End_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 51 other fields High correlation
Chr Start_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 51 other fields High correlation
Chr_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 51 other fields High correlation
Consensus Whole ExomeSequence Avail_specimen_specimens is highly overall correlated with AdditionalMedicalHistory_df_pinfo_paciente_specimens and 51 other fields High correlation
DiagnosisSubtype_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 55 other fields High correlation
DiagnosisSubtype_df_race_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 55 other fields High correlation
DiagnosisSubtype_pathology is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 55 other fields High correlation
DiagnosisSubtype_samples is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 55 other fields High correlation
DiagnosisSubtype_specimen_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 55 other fields High correlation
DiagnosisSubtype_trathistory_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 55 other fields High correlation
Ethnicity_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 59 other fields High correlation
Existing Variant_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 52 other fields High correlation
Grade StageInformation_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 47 other fields High correlation
HGVS ProteinChange_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 51 other fields High correlation
HGVS cDNAChange_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 51 other fields High correlation
Has KnownMetastaticDisease_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 44 other fields High correlation
Has Smoked100 Cigarettes_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 50 other fields High correlation
HugoSymbol_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 51 other fields High correlation
Human PathogenTesting Summary_specimen_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 47 other fields High correlation
InferredAncestry_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 56 other fields High correlation
InferredAncestry_df_race_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 56 other fields High correlation
Inflammatory Cell_pathology is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 45 other fields High correlation
MSI Status_specimen_specimens is highly overall correlated with AdditionalMedicalHistory_df_pinfo_paciente_specimens and 51 other fields High correlation
ModelNotes_specimen_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 59 other fields High correlation
Molecular andIHC Data_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 57 other fields High correlation
MutationEffect_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 51 other fields High correlation
Necrosis_pathology is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 12 other fields High correlation
Occupation_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 56 other fields High correlation
OncoKB Cancer GenePanel Data Avail_samples is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 9 other fields High correlation
Oncogenicity_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 51 other fields High correlation
PDM Type_pathology is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 12 other fields High correlation
PDM Type_samples is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 9 other fields High correlation
PDX GrowthCurve Avail_specimen_specimens is highly overall correlated with AdditionalMedicalHistory_df_pinfo_paciente_specimens and 51 other fields High correlation
Passage_pathology is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 17 other fields High correlation
Passage_samples is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 16 other fields High correlation
Pathology Notes_pathology is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 48 other fields High correlation
PathologyAvail_samples is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 22 other fields High correlation
Patient ID is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 53 other fields High correlation
Patient/OriginatingSpecimen_pathology is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 12 other fields High correlation
Patient/OriginatingSpecimen_samples is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 9 other fields High correlation
PatientNotes_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 57 other fields High correlation
ProvidedTissue Origin_specimen_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 49 other fields High correlation
RNASeqAvail_samples is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 9 other fields High correlation
Race_df_pinfo_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 52 other fields High correlation
Sample ID_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 45 other fields High correlation
Sample ID_pathology is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 48 other fields High correlation
Sample ID_samples is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 47 other fields High correlation
Self-ReportedEthnicity_df_race_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 59 other fields High correlation
Self-ReportedRace_df_race_paciente_specimens is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 52 other fields High correlation
Specimen ID is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 56 other fields High correlation
StandardizedRegimen_trathistory_paciente_specimens is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 34 other fields High correlation
Stromal_pathology is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 31 other fields High correlation
Timing_trathistory_paciente_specimens is highly overall correlated with Best Response_trathistory_paciente_specimens and 1 other field High correlation
Total Reads_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 47 other fields High correlation
Tumor Content_pathology is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 12 other fields High correlation
Tumor Grade_pathology is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 43 other fields High correlation
Variant AlleleFrequency_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 48 other fields High correlation
VariantClass_gene is highly overall correlated with %European(CEU)_df_pinfo_paciente_specimens and 48 other fields High correlation
Whole ExomeSequence Avail_samples is highly overall correlated with %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens and 9 other fields High correlation
DiagnosisSubtype_trathistory_paciente_specimens is highly imbalanced (76.5%) Imbalance
DiagnosisSubtype_df_race_paciente_specimens is highly imbalanced (76.5%) Imbalance
Self-ReportedRace_df_race_paciente_specimens is highly imbalanced (72.5%) Imbalance
Self-ReportedEthnicity_df_race_paciente_specimens is highly imbalanced (98.8%) Imbalance
%European(CEU)_df_race_paciente_specimens is highly imbalanced (73.6%) Imbalance
%Native and LatinAmerican (NA)_df_race_paciente_specimens is highly imbalanced (98.8%) Imbalance
%West African(YRI)_df_race_paciente_specimens is highly imbalanced (67.8%) Imbalance
InferredAncestry_df_race_paciente_specimens is highly imbalanced (77.4%) Imbalance
BiologicalSex_df_pinfo_paciente_specimens is highly imbalanced (57.0%) Imbalance
DiagnosisSubtype_df_pinfo_paciente_specimens is highly imbalanced (76.5%) Imbalance
%European(CEU)_df_pinfo_paciente_specimens is highly imbalanced (73.6%) Imbalance
%Native and LatinAmerican (NA)_df_pinfo_paciente_specimens is highly imbalanced (98.8%) Imbalance
%West African(YRI)_df_pinfo_paciente_specimens is highly imbalanced (67.8%) Imbalance
AdditionalMedicalHistory_df_pinfo_paciente_specimens is highly imbalanced (72.2%) Imbalance
Ethnicity_df_pinfo_paciente_specimens is highly imbalanced (98.8%) Imbalance
Grade StageInformation_df_pinfo_paciente_specimens is highly imbalanced (56.9%) Imbalance
Has Smoked100 Cigarettes_df_pinfo_paciente_specimens is highly imbalanced (74.1%) Imbalance
InferredAncestry_df_pinfo_paciente_specimens is highly imbalanced (77.4%) Imbalance
Molecular andIHC Data_df_pinfo_paciente_specimens is highly imbalanced (68.5%) Imbalance
Occupation_df_pinfo_paciente_specimens is highly imbalanced (58.8%) Imbalance
PatientNotes_df_pinfo_paciente_specimens is highly imbalanced (73.3%) Imbalance
Race_df_pinfo_paciente_specimens is highly imbalanced (72.5%) Imbalance
Specimen ID is highly imbalanced (60.4%) Imbalance
BiopsySite_specimen_specimens is highly imbalanced (58.6%) Imbalance
DiagnosisSubtype_specimen_specimens is highly imbalanced (76.5%) Imbalance
PDX GrowthCurve Avail_specimen_specimens is highly imbalanced (97.7%) Imbalance
Consensus Whole ExomeSequence Avail_specimen_specimens is highly imbalanced (97.7%) Imbalance
MSI Status_specimen_specimens is highly imbalanced (97.7%) Imbalance
Human PathogenTesting Summary_specimen_specimens is highly imbalanced (70.8%) Imbalance
Able to ViablyPassage in nude mice_specimen_specimens is highly imbalanced (70.0%) Imbalance
ModelNotes_specimen_specimens is highly imbalanced (80.6%) Imbalance
DiagnosisSubtype_samples is highly imbalanced (76.5%) Imbalance
PathologyAvail_samples is highly imbalanced (95.9%) Imbalance
HugoSymbol_gene is highly imbalanced (60.9%) Imbalance
Chr_gene is highly imbalanced (60.9%) Imbalance
DiagnosisSubtype_pathology is highly imbalanced (76.5%) Imbalance
Patient/OriginatingSpecimen_pathology is highly imbalanced (51.3%) Imbalance
PDM Type_pathology is highly imbalanced (51.3%) Imbalance
Tumor Grade_pathology is highly imbalanced (80.9%) Imbalance
Necrosis_pathology is highly imbalanced (57.2%) Imbalance
DiagnosisSubtype_trathistory_paciente_specimens has 543 (19.9%) missing values Missing
Date RegimenStarted_trathistory_paciente_specimens has 402 (14.7%) missing values Missing
Best Response_trathistory_paciente_specimens has 2545 (93.1%) missing values Missing
DiagnosisSubtype_df_race_paciente_specimens has 543 (19.9%) missing values Missing
DiagnosisSubtype_df_pinfo_paciente_specimens has 543 (19.9%) missing values Missing
AdditionalMedicalHistory_df_pinfo_paciente_specimens has 311 (11.4%) missing values Missing
Molecular andIHC Data_df_pinfo_paciente_specimens has 335 (12.3%) missing values Missing
PatientNotes_df_pinfo_paciente_specimens has 331 (12.1%) missing values Missing
DiagnosisSubtype_specimen_specimens has 543 (19.9%) missing values Missing
ModelNotes_specimen_specimens has 513 (18.8%) missing values Missing
DiagnosisSubtype_samples has 543 (19.9%) missing values Missing
Passage_samples has 309 (11.3%) missing values Missing
Sample ID_gene has 627 (22.9%) missing values Missing
HugoSymbol_gene has 627 (22.9%) missing values Missing
Chr_gene has 627 (22.9%) missing values Missing
Chr End_gene has 627 (22.9%) missing values Missing
Chr Start_gene has 627 (22.9%) missing values Missing
HGVS ProteinChange_gene has 627 (22.9%) missing values Missing
HGVS cDNAChange_gene has 627 (22.9%) missing values Missing
Total Reads_gene has 627 (22.9%) missing values Missing
Variant AlleleFrequency_gene has 627 (22.9%) missing values Missing
VariantClass_gene has 627 (22.9%) missing values Missing
MutationEffect_gene has 627 (22.9%) missing values Missing
Oncogenicity_gene has 627 (22.9%) missing values Missing
Existing Variant_gene has 1599 (58.5%) missing values Missing
DiagnosisSubtype_pathology has 543 (19.9%) missing values Missing
Passage_pathology has 297 (10.9%) missing values Missing
Stromal_pathology has 218 (8.0%) zeros Zeros

Reproduction

Analysis started: 2025-07-15 01:50:05.536798
Analysis finished: 2025-07-15 01:50:33.234352
Duration: 27.7 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json
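
The reported duration can be checked against the two timestamps. A minimal sketch using only the Python standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block of the report.
started = datetime.fromisoformat("2025-07-15 01:50:05.536798")
finished = datetime.fromisoformat("2025-07-15 01:50:33.234352")

# Subtracting two datetimes yields a timedelta.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.1f} seconds")
```

Rounded to one decimal place, this agrees with the 27.7 seconds listed above.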

Variables

Patient ID
Real number (ℝ)

High correlation 

Distinct: 10
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 237938.16
Minimum: 111316
Maximum: 949853
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.5 KiB

Quantile statistics

Minimum: 111316
5-th percentile: 111316
Q1: 111316
Median: 111316
Q3: 111316
95-th percentile: 814656
Maximum: 949853
Range: 838537
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 245399.81
Coefficient of variation (CV): 1.0313596
Kurtosis: 1.2333676
Mean: 237938.16
Median Absolute Deviation (MAD): 0
Skewness: 1.6514702
Sum: 6.50285 × 10^8
Variance: 6.0221068 × 10^10
Monotonicity: Increasing
Histogram with fixed size bins (bins=10)

Common values

Value   Count  Frequency (%)
111316  2106   77.1%
627122  200    7.3%
429767  128    4.7%
814656  108    4.0%
636974  98     3.6%
949853  84     3.1%
738633  3      0.1%
246632  2      0.1%
358529  2      0.1%
899375  2      0.1%

Minimum 10 values

Value   Count  Frequency (%)
111316  2106   77.1%
246632  2      0.1%
358529  2      0.1%
429767  128    4.7%
627122  200    7.3%
636974  98     3.6%
738633  3      0.1%
814656  108    4.0%
899375  2      0.1%
949853  84     3.1%

Maximum 10 values

Value   Count  Frequency (%)
949853  84     3.1%
899375  2      0.1%
814656  108    4.0%
738633  3      0.1%
636974  98     3.6%
627122  200    7.3%
429767  128    4.7%
358529  2      0.1%
246632  2      0.1%
111316  2106   77.1%
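
Because Patient ID takes only ten distinct values, several of the summary figures can be recomputed from the frequency table alone. The sketch below expands the table back into one value per observation and uses the standard `statistics` module; the dictionary name is illustrative, and the counts are copied from the table above.

```python
from statistics import mean, median

# Value -> count pairs copied from the Patient ID frequency table.
patient_id_counts = {
    111316: 2106, 627122: 200, 429767: 128, 814656: 108, 636974: 98,
    949853: 84, 738633: 3, 246632: 2, 358529: 2, 899375: 2,
}

# Expand the table into one entry per observation.
values = [v for v, n in patient_id_counts.items() for _ in range(n)]

total = sum(values)
print("observations:", len(values))
print("sum:", total)
print("mean:", round(mean(values), 2))
print("median:", median(values))
print("range:", max(values) - min(values))
```

Since 77.1% of rows share the minimum value, the median and both quartiles collapse onto 111316, which is why the report shows an interquartile range of 0.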

DiagnosisSubtype_trathistory_paciente_specimens
Categorical

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.1%
Missing: 543
Missing (%): 19.9%
Memory size: 21.5 KiB

epithelioid type: 2106
spindle type: 84

Length

Max length: 16
Median length: 16
Mean length: 15.846575
Min length: 12

Characters and Unicode

Total characters: 34704
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
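
The length and character totals for this variable can be recovered from the two category counts. A minimal sketch, assuming the statistics are computed over non-missing values only (the names below are illustrative):

```python
# Category counts copied from the report (missing values excluded).
category_counts = {"epithelioid type": 2106, "spindle type": 84}

n_values = sum(category_counts.values())
# Each label contributes its length once per occurrence.
total_chars = sum(len(label) * n for label, n in category_counts.items())
mean_length = total_chars / n_values
# Distinct characters across all labels, including the space.
distinct_chars = len(set("".join(category_counts)))

print(total_chars, round(mean_length, 6), distinct_chars)
```

Under that assumption the totals match the 34704 characters, mean length 15.846575, and 12 distinct characters listed above.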

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: epithelioid type
2nd row: epithelioid type
3rd row: epithelioid type
4th row: epithelioid type
5th row: epithelioid type

Common Values

Value             Count  Frequency (%)
epithelioid type  2106   77.1%
spindle type      84     3.1%
(Missing)         543    19.9%
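
The 76.5% imbalance reported for this variable in the Alerts section is consistent with one minus the normalized Shannon entropy of the non-missing class counts. The sketch below assumes that definition; the report itself does not state the formula, and `imbalance_score` is a hypothetical helper, not a library function.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, close to 1 when one class dominates."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # assumption: a single class is treated as not imbalanced
    entropy = -sum(p * log2(p) for p in probs)
    return 1 - entropy / log2(len(probs))

# Non-missing counts for "epithelioid type" and "spindle type".
score = imbalance_score([2106, 84])
print(f"{score:.1%}")
```

Under the same assumption, the counts 2494, 236, and 3 for Self-ReportedRace give roughly 72.5%, which also matches its alert.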

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value        Count  Frequency (%)
type         2190   50.0%
epithelioid  2106   48.1%
spindle      84     1.9%
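
The word and character frequency tables for this variable can be rebuilt from the category counts. A minimal sketch using `collections.Counter`, assuming words are lowercased and split on whitespace while characters are counted as they appear (the names are illustrative):

```python
from collections import Counter

# Category counts copied from the report (missing values excluded).
category_counts = {"epithelioid type": 2106, "spindle type": 84}

word_counts = Counter()
char_counts = Counter()
for label, n in category_counts.items():
    # Each word and character occurs once per row carrying this label.
    for word in label.lower().split():
        word_counts[word] += n
    for ch in label:
        char_counts[ch] += n

total_words = sum(word_counts.values())
for word, n in word_counts.most_common():
    print(f"{word:<12}{n:>6}{n / total_words:>8.1%}")
print(char_counts.most_common(3))
```

The space character is counted like any other, which accounts for the unlabeled 2190 row in the character table.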

Most occurring characters

Value             Count  Frequency (%)
e                 6486   18.7%
i                 6402   18.4%
p                 4380   12.6%
t                 4296   12.4%
l                 2190   6.3%
d                 2190   6.3%
y                 2190   6.3%
(space)           2190   6.3%
o                 2106   6.1%
h                 2106   6.1%
Other values (2)  168    0.5%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  34704  100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  34704  100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  34704  100.0%

Timing_trathistory_paciente_specimens
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Prior: 1385
Current: 1348

Length

Max length: 7
Median length: 5
Mean length: 5.9864618
Min length: 5

Characters and Unicode

Total characters: 16361
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Current
2nd row: Current
3rd row: Current
4th row: Current
5th row: Current

Common Values

Value    Count  Frequency (%)
Prior    1385   50.7%
Current  1348   49.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value    Count  Frequency (%)
prior    1385   50.7%
current  1348   49.3%

Most occurring characters

Value  Count  Frequency (%)
r      5466   33.4%
P      1385   8.5%
i      1385   8.5%
o      1385   8.5%
C      1348   8.2%
u      1348   8.2%
e      1348   8.2%
n      1348   8.2%
t      1348   8.2%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  16361  100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  16361  100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  16361  100.0%

Date RegimenStarted_trathistory_paciente_specimens
DateTime

Missing

Distinct: 11
Distinct (%): 0.5%
Missing: 402
Missing (%): 14.7%
Memory size: 21.5 KiB
Minimum: 2007-05-01 00:00:00
Maximum: 2021-08-01 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=11)

StandardizedRegimen_trathistory_paciente_specimens
Categorical

High correlation

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Imatinib Mesylate: 1276
Ripretinib: 1053
Treatment naive: 286
No Current Therapy: 116
Sunitinib Malate: 1

Length

Max length: 18
Median length: 17
Mean length: 14.132821
Min length: 9

Characters and Unicode

Total characters: 38625
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: Ripretinib
2nd row: Ripretinib
3rd row: Ripretinib
4th row: Ripretinib
5th row: Ripretinib

Common Values

Value               Count  Frequency (%)
Imatinib Mesylate   1276   46.7%
Ripretinib          1053   38.5%
Treatment naive     286    10.5%
No Current Therapy  116    4.2%
Sunitinib Malate    1      < 0.1%
Radiation           1      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value       Count  Frequency (%)
imatinib    1276   28.2%
mesylate    1276   28.2%
ripretinib  1053   23.3%
treatment   286    6.3%
naive       286    6.3%
no          116    2.6%
current     116    2.6%
therapy     116    2.6%
sunitinib   1      < 0.1%
malate      1      < 0.1%

Most occurring characters

Value              Count  Frequency (%)
i                  6002   15.5%
e                  4696   12.2%
t                  4296   11.1%
a                  3244   8.4%
n                  3020   7.8%
b                  2330   6.0%
(space)            1795   4.6%
r                  1687   4.4%
m                  1562   4.0%
y                  1392   3.6%
Other values (15)  8601   22.3%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  38625  100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  38625  100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  38625  100.0%

Best Response_trathistory_paciente_specimens
Categorical

High correlation  Missing 

Distinct: 5
Distinct (%): 2.7%
Missing: 2545
Missing (%): 93.1%
Memory size: 21.5 KiB

Stable Disease: 113
PR: 37
CR: 36
<Unknown>: 1
Non-evaluable: 1

Length

Max length: 14
Median length: 14
Mean length: 9.3085106
Min length: 2

Characters and Unicode

Total characters: 1750
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 1.1%

Sample

1st row: PR
2nd row: Stable Disease
3rd row: Stable Disease
4th row: Stable Disease
5th row: Stable Disease

Common Values

Value           Count  Frequency (%)
Stable Disease  113    4.1%
PR              37     1.4%
CR              36     1.3%
<Unknown>       1      < 0.1%
Non-evaluable   1      < 0.1%
(Missing)       2545   93.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value          Count  Frequency (%)
stable         113    37.5%
disease        113    37.5%
pr             37     12.3%
cr             36     12.0%
unknown        1      0.3%
non-evaluable  1      0.3%

Most occurring characters

Value              Count  Frequency (%)
e                  341    19.5%
a                  228    13.0%
s                  226    12.9%
l                  115    6.6%
b                  114    6.5%
S                  113    6.5%
(space)            113    6.5%
t                  113    6.5%
D                  113    6.5%
i                  113    6.5%
Other values (14)  161    9.2%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1750   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1750   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1750   100.0%

DiagnosisSubtype_df_race_paciente_specimens
Categorical

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.1%
Missing: 543
Missing (%): 19.9%
Memory size: 21.5 KiB

epithelioid type: 2106
spindle type: 84

Length

Max length: 16
Median length: 16
Mean length: 15.846575
Min length: 12

Characters and Unicode

Total characters: 34704
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: epithelioid type
2nd row: epithelioid type
3rd row: epithelioid type
4th row: epithelioid type
5th row: epithelioid type

Common Values

Value             Count  Frequency (%)
epithelioid type  2106   77.1%
spindle type      84     3.1%
(Missing)         543    19.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value        Count  Frequency (%)
type         2190   50.0%
epithelioid  2106   48.1%
spindle      84     1.9%

Most occurring characters

Value             Count  Frequency (%)
e                 6486   18.7%
i                 6402   18.4%
p                 4380   12.6%
t                 4296   12.4%
l                 2190   6.3%
d                 2190   6.3%
y                 2190   6.3%
(space)           2190   6.3%
o                 2106   6.1%
h                 2106   6.1%
Other values (2)  168    0.5%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  34704  100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  34704  100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  34704  100.0%

Self-ReportedRace_df_race_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

White: 2494
Black or African American: 236
Not Provided: 3

Length

Max length: 25
Median length: 5
Mean length: 6.7347237
Min length: 5

Characters and Unicode

Total characters: 18406
Distinct characters: 21
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: White
2nd row: White
3rd row: White
4th row: White
5th row: White

Common Values

Value                      Count  Frequency (%)
White                      2494   91.3%
Black or African American  236    8.6%
Not Provided               3      0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value     Count  Frequency (%)
white     2494   72.4%
black     236    6.9%
or        236    6.9%
african   236    6.9%
american  236    6.9%
not       3      0.1%
provided  3      0.1%

Most occurring characters

Value              Count  Frequency (%)
i                  2969   16.1%
e                  2733   14.8%
t                  2497   13.6%
W                  2494   13.5%
h                  2494   13.5%
r                  711    3.9%
(space)            711    3.9%
c                  708    3.8%
a                  708    3.8%
n                  472    2.6%
Other values (11)  1909   10.4%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  18406  100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  18406  100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  18406  100.0%

Self-ReportedEthnicity_df_race_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Not Hispanic or Latino: 2730
Not Provided: 3

Length

Max length: 22
Median length: 22
Mean length: 21.989023
Min length: 12

Characters and Unicode

Total characters: 60096
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Not Hispanic or Latino
2nd row: Not Hispanic or Latino
3rd row: Not Hispanic or Latino
4th row: Not Hispanic or Latino
5th row: Not Hispanic or Latino

Common Values

Value                   Count  Frequency (%)
Not Hispanic or Latino  2730   99.9%
Not Provided            3      0.1%

Length

2025-07-15T01:50:34.974395image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
Histogram of lengths of the category

Common Values (Plot)

2025-07-15T01:50:35.047072image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
ValueCountFrequency (%)
not 2733
25.0%
hispanic 2730
25.0%
or 2730
25.0%
latino 2730
25.0%
provided 3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
o 8196
13.6%
i 8193
13.6%
8193
13.6%
t 5463
9.1%
a 5460
9.1%
n 5460
9.1%
N 2733
 
4.5%
r 2733
 
4.5%
H 2730
 
4.5%
p 2730
 
4.5%
Other values (7) 8205
13.7%

Most occurring categories

ValueCountFrequency (%)
(unknown) 60096
100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 60096 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 60096 | 100.0%
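
The character summary for this variable can be approximated with Python's standard library. The sketch below is an assumption about how such counts could be derived, not the profiler's own implementation: it rebuilds the column from the two reported values and their counts, then tallies Unicode general categories with `unicodedata.category`. The report itself lists a single `(unknown)` category, whereas this sketch resolves each character to its general category. The standard library offers no script or block lookup, so only categories are covered. The names `char_category_counts` and `values` are illustrative.

```python
import unicodedata
from collections import Counter


def char_category_counts(values):
    """Count Unicode general categories over every character in the strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Column rebuilt from the reported value counts (illustrative, not the raw data).
values = ["Not Hispanic or Latino"] * 2730 + ["Not Provided"] * 3

category_counts = char_category_counts(values)
total_chars = sum(category_counts.values())

print(total_chars)                    # total number of characters in the column
print(category_counts.most_common())  # e.g. lowercase letters, uppercase letters, spaces
```

The total agrees with the reported "Total characters" figure, and the space-separator count agrees with the unlabeled row in the character table.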

%European(CEU)_df_race_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

100 | 2490
18 | 128
0 | 112
73 | 3

Length

Max length: 3
Median length: 3
Mean length: 2.8701061
Min length: 1

Characters and Unicode

Total characters: 7844
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 100
2nd row: 100
3rd row: 100
4th row: 100
5th row: 100

Common Values

Value | Count | Frequency (%)
100 | 2490 | 91.1%
18 | 128 | 4.7%
0 | 112 | 4.1%
73 | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
100 | 2490 | 91.1%
18 | 128 | 4.7%
0 | 112 | 4.1%
73 | 3 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 5092 | 64.9%
1 | 2618 | 33.4%
8 | 128 | 1.6%
7 | 3 | < 0.1%
3 | 3 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 7844 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 7844 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 7844 | 100.0%

%Native and LatinAmerican (NA)_df_race_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

0 | 2730
27 | 3

Length

Max length: 2
Median length: 1
Mean length: 1.0010977
Min length: 1

Characters and Unicode

Total characters: 2736
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 2730 | 99.9%
27 | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 2730 | 99.9%
27 | 3 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 2730 | 99.8%
2 | 3 | 0.1%
7 | 3 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2736 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2736 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2736 | 100.0%
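
The length statistics for this variable follow directly from the value counts. As a minimal sketch (the column is rebuilt from the reported counts rather than read from the original data, and the names `values`, `lengths`, and `length_summary` are illustrative), the total character count and the max, median, mean, and min lengths can be recomputed like this:

```python
from statistics import mean, median

# Column rebuilt from the reported counts: "0" x 2730 and "27" x 3 (illustrative).
values = ["0"] * 2730 + ["27"] * 3

# String length of every entry, as a categorical profile would measure it.
lengths = [len(v) for v in values]

total_characters = sum(lengths)
length_summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),  # total characters divided by number of rows
    "min": min(lengths),
}

print(total_characters, length_summary)
```

The mean length is simply total characters divided by the number of observations, which is why three two-character values in 2733 rows move it only slightly above 1.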

%West African(YRI)_df_race_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

0 | 2497
82 | 128
100 | 108

Length

Max length: 3
Median length: 1
Mean length: 1.125869
Min length: 1

Characters and Unicode

Total characters: 3077
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 2497 | 91.4%
82 | 128 | 4.7%
100 | 108 | 4.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 2497 | 91.4%
82 | 128 | 4.7%
100 | 108 | 4.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 2713 | 88.2%
8 | 128 | 4.2%
2 | 128 | 4.2%
1 | 108 | 3.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 3077 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 3077 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 3077 | 100.0%

InferredAncestry_df_race_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

European (CEU) | 2490
West African (YRI) | 236
Not Applicable | 4
Mixed (All < 80%) | 3

Length

Max length: 18
Median length: 14
Mean length: 14.348701
Min length: 14

Characters and Unicode

Total characters: 39215
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: European (CEU)
2nd row: European (CEU)
3rd row: European (CEU)
4th row: European (CEU)
5th row: European (CEU)

Common Values

Value | Count | Frequency (%)
European (CEU) | 2490 | 91.1%
West African (YRI) | 236 | 8.6%
Not Applicable | 4 | 0.1%
Mixed (All < 80%) | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
european | 2490 | 43.6%
ceu | 2490 | 43.6%
west | 236 | 4.1%
african | 236 | 4.1%
yri | 236 | 4.1%
not | 4 | 0.1%
applicable | 4 | 0.1%
mixed | 3 | 0.1%
all | 3 | 0.1%
(value not shown) | 3 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
E | 4980 | 12.7%
(space) | 2975 | 7.6%
e | 2733 | 7.0%
a | 2730 | 7.0%
( | 2729 | 7.0%
) | 2729 | 7.0%
r | 2726 | 7.0%
n | 2726 | 7.0%
p | 2498 | 6.4%
o | 2494 | 6.4%
Other values (23) | 9895 | 25.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 39215 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 39215 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 39215 | 100.0%

BiologicalSex_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Male | 2492
Female | 241

Length

Max length: 6
Median length: 4
Mean length: 4.176363
Min length: 4

Characters and Unicode

Total characters: 11414
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Male
2nd row: Male
3rd row: Male
4th row: Male
5th row: Male

Common Values

Value | Count | Frequency (%)
Male | 2492 | 91.2%
Female | 241 | 8.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
male | 2492 | 91.2%
female | 241 | 8.8%

Most occurring characters

Value | Count | Frequency (%)
e | 2974 | 26.1%
a | 2733 | 23.9%
l | 2733 | 23.9%
M | 2492 | 21.8%
F | 241 | 2.1%
m | 241 | 2.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 11414 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 11414 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 11414 | 100.0%
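
The Common Values percentages for this variable are each count divided by the number of observations. The following is a hedged sketch of that arithmetic using only the standard library. The column is rebuilt from the reported counts, rounding to one decimal place is an assumption about the report's display format, and `frequency_table` is an illustrative name.

```python
from collections import Counter

# Column rebuilt from the reported counts (illustrative, not the raw data).
values = ["Male"] * 2492 + ["Female"] * 241

counts = Counter(values)
n = len(values)

# One (value, count, percentage) row per distinct value, most common first.
frequency_table = [
    (value, count, round(100 * count / n, 1))
    for value, count in counts.most_common()
]

for row in frequency_table:
    print(row)
```

A single value covering more than 90% of rows, as here, is the kind of distribution the report flags with its Imbalance alert.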

DiagnosisSubtype_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.1%
Missing: 543
Missing (%): 19.9%
Memory size: 21.5 KiB

epithelioid type | 2106
spindle type | 84

Length

Max length: 16
Median length: 16
Mean length: 15.846575
Min length: 12

Characters and Unicode

Total characters: 34704
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: epithelioid type
2nd row: epithelioid type
3rd row: epithelioid type
4th row: epithelioid type
5th row: epithelioid type

Common Values

Value | Count | Frequency (%)
epithelioid type | 2106 | 77.1%
spindle type | 84 | 3.1%
(Missing) | 543 | 19.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
type | 2190 | 50.0%
epithelioid | 2106 | 48.1%
spindle | 84 | 1.9%

Most occurring characters

Value | Count | Frequency (%)
e | 6486 | 18.7%
i | 6402 | 18.4%
p | 4380 | 12.6%
t | 4296 | 12.4%
l | 2190 | 6.3%
d | 2190 | 6.3%
y | 2190 | 6.3%
(space) | 2190 | 6.3%
o | 2106 | 6.1%
h | 2106 | 6.1%
Other values (2) | 168 | 0.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 34704 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 34704 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 34704 | 100.0%
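
This variable mixes two denominators: the Common Values percentages are taken over all 2733 rows (missing included), while the word percentages are taken over the words in non-missing rows only. The sketch below illustrates that difference under stated assumptions: the column is rebuilt from the reported counts with `None` standing in for missing cells, words are obtained by lowercasing and splitting on whitespace (an assumption about the tokenisation), and all variable names are illustrative.

```python
from collections import Counter

# Column rebuilt from the reported counts; None marks a missing cell (illustrative).
values = ["epithelioid type"] * 2106 + ["spindle type"] * 84 + [None] * 543

n = len(values)
present = [v for v in values if v is not None]

# Share of missing cells, relative to all rows.
missing_pct = 100 * (n - len(present)) / n

# Value percentages use all rows as the denominator.
value_pct = {v: 100 * c / n for v, c in Counter(present).items()}

# Word percentages use the total number of words in non-missing rows.
word_counts = Counter(word for v in present for word in v.lower().split())
total_words = sum(word_counts.values())
word_pct = {w: 100 * c / total_words for w, c in word_counts.items()}

print(round(missing_pct, 1), value_pct, word_pct)
```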

Age atDiagnosis_df_pinfo_paciente_specimens
Real number (ℝ)

High correlation 

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.003293
Minimum: 31
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.5 KiB

Quantile statistics

Minimum: 31
5-th percentile: 39
Q1: 58
median: 58
Q3: 58
95-th percentile: 58
Maximum: 80
Range: 49
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6.2147681
Coefficient of variation (CV): 0.11097148
Kurtosis: 3.6330809
Mean: 56.003293
Median Absolute Deviation (MAD): 0
Skewness: -2.0186807
Sum: 153057
Variance: 38.623342
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Values by frequency

Value | Count | Frequency (%)
58 | 2106 | 77.1%
39 | 284 | 10.4%
53 | 128 | 4.7%
57 | 108 | 4.0%
66 | 98 | 3.6%
31 | 3 | 0.1%
51 | 2 | 0.1%
35 | 2 | 0.1%
80 | 2 | 0.1%

Values in ascending order

Value | Count | Frequency (%)
31 | 3 | 0.1%
35 | 2 | 0.1%
39 | 284 | 10.4%
51 | 2 | 0.1%
53 | 128 | 4.7%
57 | 108 | 4.0%
58 | 2106 | 77.1%
66 | 98 | 3.6%
80 | 2 | 0.1%

Values in descending order

Value | Count | Frequency (%)
80 | 2 | 0.1%
66 | 98 | 3.6%
58 | 2106 | 77.1%
57 | 108 | 4.0%
53 | 128 | 4.7%
51 | 2 | 0.1%
39 | 284 | 10.4%
35 | 2 | 0.1%
31 | 3 | 0.1%
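
The descriptive statistics for this variable can be cross-checked against its value counts. The sketch below is one way to do that with the standard library, not the profiler's own code: it expands the reported counts into a list and recomputes the sum, mean, standard deviation, coefficient of variation, and range. Using the sample (n - 1) standard deviation is an assumption; it appears to reproduce the reported figure. Names such as `value_counts` and `ages` are illustrative.

```python
from statistics import mean, median, stdev

# Reported value counts for age at diagnosis (value -> number of rows).
value_counts = {58: 2106, 39: 284, 53: 128, 57: 108, 66: 98, 31: 3, 51: 2, 35: 2, 80: 2}

# Expand the counts into one entry per observation.
ages = [age for age, count in value_counts.items() for _ in range(count)]

total = sum(ages)
mean_age = mean(ages)
std_age = stdev(ages)            # sample standard deviation (n - 1 denominator)
cv = std_age / mean_age          # coefficient of variation = std / mean
value_range = max(ages) - min(ages)
median_age = median(ages)

print(total, mean_age, std_age, cv, value_range, median_age)
```

Because 77.1% of rows share the value 58, the quartiles and the median coincide, which is why the interquartile range and the median absolute deviation are both 0.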

%European(CEU)_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

100 | 2490
18 | 128
0 | 112
73 | 3

Length

Max length: 3
Median length: 3
Mean length: 2.8701061
Min length: 1

Characters and Unicode

Total characters: 7844
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 100
2nd row: 100
3rd row: 100
4th row: 100
5th row: 100

Common Values

Value | Count | Frequency (%)
100 | 2490 | 91.1%
18 | 128 | 4.7%
0 | 112 | 4.1%
73 | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
100 | 2490 | 91.1%
18 | 128 | 4.7%
0 | 112 | 4.1%
73 | 3 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 5092 | 64.9%
1 | 2618 | 33.4%
8 | 128 | 1.6%
7 | 3 | < 0.1%
3 | 3 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 7844 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 7844 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 7844 | 100.0%

%Native and LatinAmerican (NA)_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

0 | 2730
27 | 3

Length

Max length: 2
Median length: 1
Mean length: 1.0010977
Min length: 1

Characters and Unicode

Total characters: 2736
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 2730 | 99.9%
27 | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 2730 | 99.9%
27 | 3 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 2730 | 99.8%
2 | 3 | 0.1%
7 | 3 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2736 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2736 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2736 | 100.0%

%West African(YRI)_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

0 | 2497
82 | 128
100 | 108

Length

Max length: 3
Median length: 1
Mean length: 1.125869
Min length: 1

Characters and Unicode

Total characters: 3077
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 2497 | 91.4%
82 | 128 | 4.7%
100 | 108 | 4.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 2497 | 91.4%
82 | 128 | 4.7%
100 | 108 | 4.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 2713 | 88.2%
8 | 128 | 4.2%
2 | 128 | 4.2%
1 | 108 | 3.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 3077 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 3077 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 3077 | 100.0%

AdditionalMedicalHistory_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance  Missing 

Distinct: 7
Distinct (%): 0.3%
Missing: 311
Missing (%): 11.4%
Memory size: 21.5 KiB

Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis. | 2106
Site of resection included spleen, kidney, distal pancreatectomy and retroperitoneal tumor | 128
History of low-grade papillary urothelial carcinoma <1year prior | 98
Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma. | 84
Concurrent malignancy - SDHB gene mutation associated paraganglioma | 2
Other values (2) | 4

Length

Max length: 155
Median length: 102
Mean length: 101.5673
Min length: 33

Characters and Unicode

Total characters: 245996
Distinct characters: 49
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.
2nd row: Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.
3rd row: Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.
4th row: Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.
5th row: Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.

Common Values

Value | Count | Frequency (%)
Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis. | 2106 | 77.1%
Site of resection included spleen, kidney, distal pancreatectomy and retroperitoneal tumor | 128 | 4.7%
History of low-grade papillary urothelial carcinoma <1year prior | 98 | 3.6%
Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma. | 84 | 3.1%
Concurrent malignancy - SDHB gene mutation associated paraganglioma | 2 | 0.1%
Final Pathology: Dx confirmed with treatment effect present; 50% viable tumor remains | 2 | 0.1%
Prior Malignancy: Prostate Cancer | 2 | 0.1%
(Missing) | 311 | 11.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
final | 2192 | 6.6%
pathology | 2192 | 6.6%
dx | 2192 | 6.6%
confirmed | 2192 | 6.6%
mitotic | 2190 | 6.6%
treatment | 2108 | 6.3%
effect | 2108 | 6.3%
present | 2108 | 6.3%
is | 2106 | 6.3%
hpf | 2106 | 6.3%
Other values (49) | 11826 | 35.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 31068 | 12.6%
e | 23676 | 9.6%
t | 23484 | 9.5%
r | 14800 | 6.0%
i | 14756 | 6.0%
n | 14332 | 5.8%
a | 12816 | 5.2%
o | 12746 | 5.2%
c | 11840 | 4.8%
s | 9420 | 3.8%
Other values (39) | 77058 | 31.3%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 245996 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 245996 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 245996 | 100.0%

Distinct: 10
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB
Minimum: 2007-01-01 00:00:00
Maximum: 2020-07-01 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%

Histogram with fixed size bins (bins=10)

Ethnicity_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Not Hispanic or Latino | 2730
Not Provided | 3

Length

Max length: 22
Median length: 22
Mean length: 21.989023
Min length: 12

Characters and Unicode

Total characters: 60096
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Not Hispanic or Latino
2nd row: Not Hispanic or Latino
3rd row: Not Hispanic or Latino
4th row: Not Hispanic or Latino
5th row: Not Hispanic or Latino

Common Values

Value | Count | Frequency (%)
Not Hispanic or Latino | 2730 | 99.9%
Not Provided | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
not | 2733 | 25.0%
hispanic | 2730 | 25.0%
or | 2730 | 25.0%
latino | 2730 | 25.0%
provided | 3 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
o | 8196 | 13.6%
i | 8193 | 13.6%
(space) | 8193 | 13.6%
t | 5463 | 9.1%
a | 5460 | 9.1%
n | 5460 | 9.1%
N | 2733 | 4.5%
r | 2733 | 4.5%
H | 2730 | 4.5%
p | 2730 | 4.5%
Other values (7) | 8205 | 13.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 60096 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 60096 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 60096 | 100.0%

Grade StageInformation_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Grade, TNM (Clinical) | 2106
None Provided | 417
Grade, TNM | 206
Grade, TNM (Pathological) | 2
TNM (Pathological) | 2

Length

Max length: 25
Median length: 21
Mean length: 18.95097
Min length: 10

Characters and Unicode

Total characters: 51793
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Grade, TNM (Clinical)
2nd row: Grade, TNM (Clinical)
3rd row: Grade, TNM (Clinical)
4th row: Grade, TNM (Clinical)
5th row: Grade, TNM (Clinical)

Common Values

Value | Count | Frequency (%)
Grade, TNM (Clinical) | 2106 | 77.1%
None Provided | 417 | 15.3%
Grade, TNM | 206 | 7.5%
Grade, TNM (Pathological) | 2 | 0.1%
TNM (Pathological) | 2 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
tnm | 2316 | 30.6%
grade | 2314 | 30.6%
clinical | 2106 | 27.8%
none | 417 | 5.5%
provided | 417 | 5.5%
pathological | 4 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 4841 | 9.3%
i | 4633 | 8.9%
a | 4428 | 8.5%
l | 4220 | 8.1%
d | 3148 | 6.1%
e | 3148 | 6.1%
N | 2733 | 5.3%
r | 2731 | 5.3%
n | 2523 | 4.9%
T | 2316 | 4.5%
Other values (13) | 17072 | 33.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 51793 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 51793 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 51793 | 100.0%

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Yes | 2416
Not Reported | 317

Length

Max length: 12
Median length: 3
Mean length: 4.0439078
Min length: 3

Characters and Unicode

Total characters: 11052
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Yes
2nd row: Yes
3rd row: Yes
4th row: Yes
5th row: Yes

Common Values

Value | Count | Frequency (%)
Yes | 2416 | 88.4%
Not Reported | 317 | 11.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
yes | 2416 | 79.2%
not | 317 | 10.4%
reported | 317 | 10.4%

Most occurring characters

Value | Count | Frequency (%)
e | 3050 | 27.6%
Y | 2416 | 21.9%
s | 2416 | 21.9%
o | 634 | 5.7%
t | 634 | 5.7%
N | 317 | 2.9%
(space) | 317 | 2.9%
R | 317 | 2.9%
p | 317 | 2.9%
r | 317 | 2.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 11052 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 11052 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 11052 | 100.0%

Has Smoked100 Cigarettes_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

No | 2514
Yes | 216
Not Provided | 3

Length

Max length: 12
Median length: 2
Mean length: 2.090011
Min length: 2

Characters and Unicode

Total characters: 5712
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value | Count | Frequency (%)
No | 2514 | 92.0%
Yes | 216 | 7.9%
Not Provided | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
no | 2514 | 91.9%
yes | 216 | 7.9%
not | 3 | 0.1%
provided | 3 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
o | 2520 | 44.1%
N | 2517 | 44.1%
e | 219 | 3.8%
Y | 216 | 3.8%
s | 216 | 3.8%
d | 6 | 0.1%
(space) | 3 | 0.1%
t | 3 | 0.1%
P | 3 | 0.1%
r | 3 | 0.1%
Other values (2) | 6 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 5712 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 5712 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 5712 | 100.0%

InferredAncestry_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Length

Max length: 18
Median length: 14
Mean length: 14.348701
Min length: 14
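
The length statistics can be checked by hand from the value counts of this variable. The sketch below is only an illustration: it assumes the four values are stored exactly as printed in the Common Values table, with no hidden leading or trailing whitespace, and recomputes the lengths with Python's standard library.

```python
import statistics

# Values and counts as printed in this variable's Common Values table.
# Assumption: the strings carry no hidden whitespace.
value_counts = {
    "European (CEU)": 2490,
    "West African (YRI)": 236,
    "Not Applicable": 4,
    "Mixed (All < 80%)": 3,
}

# One length per observation.
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

total_chars = sum(lengths)
mean_length = total_chars / len(lengths)
median_length = statistics.median(lengths)

print(total_chars, max(lengths), min(lengths), median_length, mean_length)
```

Under that assumption the total comes to 39215 characters, matching the Total characters figure reported for this variable.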

Characters and Unicode

Total characters: 39215
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: European (CEU)
2nd row: European (CEU)
3rd row: European (CEU)
4th row: European (CEU)
5th row: European (CEU)

Common Values

Value | Count | Frequency (%)
European (CEU) | 2490 | 91.1%
West African (YRI) | 236 | 8.6%
Not Applicable | 4 | 0.1%
Mixed (All < 80%) | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
european | 2490 | 43.6%
ceu | 2490 | 43.6%
west | 236 | 4.1%
african | 236 | 4.1%
yri | 236 | 4.1%
not | 4 | 0.1%
applicable | 4 | 0.1%
mixed | 3 | 0.1%
all | 3 | 0.1%
(unrendered token) | 3 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
E | 4980 | 12.7%
(whitespace) | 2975 | 7.6%
e | 2733 | 7.0%
a | 2730 | 7.0%
( | 2729 | 7.0%
) | 2729 | 7.0%
r | 2726 | 7.0%
n | 2726 | 7.0%
p | 2498 | 6.4%
o | 2494 | 6.4%
Other values (23) | 9895 | 25.2%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 39215 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

Molecular andIHC Data_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance  Missing 

Distinct: 5
Distinct (%): 0.2%
Missing: 335
Missing (%): 12.3%
Memory size: 21.5 KiB

Length

Max length: 152
Median length: 38
Mean length: 44.189324
Min length: 33

Characters and Unicode

Total characters: 105966
Distinct characters: 64
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: IHC: DOG1+, CD117+; S100-, CK AE1/AE3-
2nd row: IHC: DOG1+, CD117+; S100-, CK AE1/AE3-
3rd row: IHC: DOG1+, CD117+; S100-, CK AE1/AE3-
4th row: IHC: DOG1+, CD117+; S100-, CK AE1/AE3-
5th row: IHC: DOG1+, CD117+; S100-, CK AE1/AE3-

Common Values

Value | Count | Frequency (%)
IHC: DOG1+, CD117+; S100-, CK AE1/AE3- | 2106 | 77.1%
Biomarkers: c-KIT mutated, DOG1+ | 108 | 4.0%
IHC (from 10/2013 primary diagnosis): CKIT+, CD34+, BCL2+, CDX2 -, CK7 -, CK20 -, Melan-A -, S-100 -, Vimentin -. PDGFRA exon 12 &18 mutation negative. | 98 | 3.6%
IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST). | 84 | 3.1%
c.445C>T (p.Q149*) variant in the SDHB gene | 2 | 0.1%
(Missing) | 335 | 12.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
dog1 | 2298 | 13.8%
ihc | 2288 | 13.8%
cd117 | 2190 | 13.2%
s100 | 2106 | 12.7%
ck | 2106 | 12.7%
ae1/ae3 | 2106 | 12.7%
(unrendered token) | 588 | 3.5%
c-kit | 108 | 0.6%
biomarkers | 108 | 0.6%
mutated | 108 | 0.6%
Other values (35) | 2632 | 15.8%

Most occurring characters

Value | Count | Frequency (%)
(whitespace) | 14530 | 13.7%
1 | 11382 | 10.7%
C | 7174 | 6.8%
, | 5272 | 5.0%
- | 5104 | 4.8%
D | 4868 | 4.6%
+ | 4866 | 4.6%
0 | 4702 | 4.4%
A | 4408 | 4.2%
E | 4212 | 4.0%
Other values (54) | 39448 | 37.2%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 105966 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

Occupation_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 9
Distinct (%): 0.3%
Missing: 3
Missing (%): 0.1%
Memory size: 21.5 KiB

Length

Max length: 39
Median length: 8
Mean length: 8.9531136
Min length: 7

Characters and Unicode

Total characters: 24442
Distinct characters: 31
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Engineer
2nd row: Engineer
3rd row: Engineer
4th row: Engineer
5th row: Engineer

Common Values

Value | Count | Frequency (%)
Engineer | 2106 | 77.1%
Park Service | 200 | 7.3%
Disabled | 128 | 4.7%
Not Provided | 108 | 4.0%
Unknown | 98 | 3.6%
Plastics factory worker | 84 | 3.1%
insurance industry | 2 | 0.1%
Department Manger | 2 | 0.1%
Retired; prior occupation not provided | 2 | 0.1%
(Missing) | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
engineer | 2106 | 65.4%
park | 200 | 6.2%
service | 200 | 6.2%
disabled | 128 | 4.0%
not | 110 | 3.4%
provided | 110 | 3.4%
unknown | 98 | 3.0%
plastics | 84 | 2.6%
factory | 84 | 2.6%
worker | 84 | 2.6%
Other values (7) | 14 | 0.4%

Most occurring characters

Value | Count | Frequency (%)
e | 4946 | 20.2%
n | 4520 | 18.5%
r | 2882 | 11.8%
i | 2638 | 10.8%
g | 2108 | 8.6%
E | 2106 | 8.6%
(whitespace) | 598 | 2.4%
a | 504 | 2.1%
o | 492 | 2.0%
P | 392 | 1.6%
Other values (21) | 3256 | 13.3%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 24442 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

PatientNotes_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance  Missing 

Distinct: 7
Distinct (%): 0.3%
Missing: 331
Missing (%): 12.1%
Memory size: 21.5 KiB

Length

Max length: 102
Median length: 102
Mean length: 95.62448
Min length: 26

Characters and Unicode

Total characters: 229690
Distinct characters: 48
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Tumor Grade/Stage: cM0 (at diagnosis); high grade Location of known metastasis: liver, jejunum, ileum
2nd row: Tumor Grade/Stage: cM0 (at diagnosis); high grade Location of known metastasis: liver, jejunum, ileum
3rd row: Tumor Grade/Stage: cM0 (at diagnosis); high grade Location of known metastasis: liver, jejunum, ileum
4th row: Tumor Grade/Stage: cM0 (at diagnosis); high grade Location of known metastasis: liver, jejunum, ileum
5th row: Tumor Grade/Stage: cM0 (at diagnosis); high grade Location of known metastasis: liver, jejunum, ileum

Common Values

Value | Count | Frequency (%)
Tumor Grade/Stage: cM0 (at diagnosis); high grade Location of known metastasis: liver, jejunum, ileum | 2106 | 77.1%
Disease recurrence at distant site Tumor Grade/Stage: Grade 2, T2N0M1 | 108 | 4.0%
Tumor Grade/Stage: High grade, T4NXM1 | 98 | 3.6%
Location of know metastases: liver | 84 | 3.1%
Pt has concurrent malignancies - GIST and SDHB gene mutation associated paraganglioma | 2 | 0.1%
Tumor Stage/Grade: G1 - 1 low grade; pT3 pN0 pM not applicable | 2 | 0.1%
Tumor Grade/Stage: pT2NX | 2 | 0.1%
(Missing) | 331 | 12.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
tumor | 2316 | 7.3%
grade/stage | 2314 | 7.3%
grade | 2314 | 7.3%
at | 2214 | 7.0%
high | 2204 | 7.0%
liver | 2190 | 6.9%
of | 2190 | 6.9%
location | 2190 | 6.9%
cm0 | 2106 | 6.7%
diagnosis | 2106 | 6.7%
Other values (35) | 9384 | 29.8%

Most occurring characters

Value | Count | Frequency (%)
(whitespace) | 29020 | 12.6%
a | 18078 | 7.9%
e | 16282 | 7.1%
i | 15344 | 6.7%
o | 13194 | 5.7%
t | 11436 | 5.0%
s | 11222 | 4.9%
n | 10932 | 4.8%
r | 9466 | 4.1%
g | 8840 | 3.8%
Other values (38) | 85876 | 37.4%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 229690 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

Race_df_pinfo_paciente_specimens
Categorical

High correlation  Imbalance 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Length

Max length: 25
Median length: 5
Mean length: 6.7347237
Min length: 5

Characters and Unicode

Total characters: 18406
Distinct characters: 21
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: White
2nd row: White
3rd row: White
4th row: White
5th row: White

Common Values

Value | Count | Frequency (%)
White | 2494 | 91.3%
Black or African American | 236 | 8.6%
Not Provided | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
white | 2494 | 72.4%
black | 236 | 6.9%
or | 236 | 6.9%
african | 236 | 6.9%
american | 236 | 6.9%
not | 3 | 0.1%
provided | 3 | 0.1%
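
The word counts for this variable are consistent with lowercasing each value and splitting it on whitespace. The sketch below works under that assumption only (the profiler's actual tokenizer is not shown in this report) and reproduces the counts from the three category values.

```python
from collections import Counter

# Values and counts from this variable's Common Values table.
value_counts = {
    "White": 2494,
    "Black or African American": 236,
    "Not Provided": 3,
}

# Assumed tokenization: lowercase, then split on whitespace.
word_counts = Counter()
for value, n in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += n

total_words = sum(word_counts.values())
for word, n in word_counts.most_common():
    print(f"{word} | {n} | {100 * n / total_words:.1f}%")
```

Because the percentages are taken over all words rather than over rows, "white" accounts for 72.4% of words even though the value White covers 91.3% of observations.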

Most occurring characters

Value | Count | Frequency (%)
i | 2969 | 16.1%
e | 2733 | 14.8%
t | 2497 | 13.6%
W | 2494 | 13.5%
h | 2494 | 13.5%
r | 711 | 3.9%
(whitespace) | 711 | 3.9%
c | 708 | 3.8%
a | 708 | 3.8%
n | 472 | 2.6%
Other values (11) | 1909 | 10.4%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 18406 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

Specimen ID
Categorical

High correlation  Imbalance 

Distinct: 10
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Length

Max length: 6
Median length: 5
Mean length: 5.0007318
Min length: 5

Characters and Unicode

Total characters: 13667
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 319-R
2nd row: 319-R
3rd row: 319-R
4th row: 319-R
5th row: 319-R

Common Values

Value | Count | Frequency (%)
319-R | 2106 | 77.1%
101-R | 200 | 7.3%
202-R | 128 | 4.7%
196-R | 108 | 4.0%
082-R | 98 | 3.6%
013-R | 84 | 3.1%
008-R | 3 | 0.1%
235-R1 | 2 | 0.1%
109-R | 2 | 0.1%
194-R | 2 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
319-r | 2106 | 77.1%
101-r | 200 | 7.3%
202-r | 128 | 4.7%
196-r | 108 | 4.0%
082-r | 98 | 3.6%
013-r | 84 | 3.1%
008-r | 3 | 0.1%
235-r1 | 2 | 0.1%
109-r | 2 | 0.1%
194-r | 2 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
- | 2733 | 20.0%
R | 2733 | 20.0%
1 | 2704 | 19.8%
9 | 2218 | 16.2%
3 | 2192 | 16.0%
0 | 518 | 3.8%
2 | 356 | 2.6%
6 | 108 | 0.8%
8 | 101 | 0.7%
5 | 2 | < 0.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 13667 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

BiopsySite_specimen_specimens
Categorical

High correlation  Imbalance 

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Length

Max length: 16
Median length: 15
Mean length: 14.654592
Min length: 7

Characters and Unicode

Total characters: 40051
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Liver [central]
2nd row: Liver [central]
3rd row: Liver [central]
4th row: Liver [central]
5th row: Liver [central]

Common Values

Value | Count | Frequency (%)
Liver [central] | 2106 | 77.1%
Gastric Fundus | 200 | 7.3%
Abdominal Mass | 128 | 4.7%
Stomach [distal] | 108 | 4.0%
abdominal mass | 98 | 3.6%
Stomach | 86 | 3.1%
Stomach/Liver | 3 | 0.1%
Gastric | 2 | 0.1%
Stomach | 2 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
liver | 2106 | 39.2%
central | 2106 | 39.2%
abdominal | 226 | 4.2%
mass | 226 | 4.2%
gastric | 202 | 3.8%
fundus | 200 | 3.7%
stomach | 196 | 3.6%
distal | 108 | 2.0%
stomach/liver | 3 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
r | 4417 | 11.0%
e | 4215 | 10.5%
a | 3165 | 7.9%
(whitespace) | 2740 | 6.8%
i | 2645 | 6.6%
t | 2615 | 6.5%
n | 2532 | 6.3%
c | 2507 | 6.3%
l | 2440 | 6.1%
[ | 2214 | 5.5%
Other values (16) | 10561 | 26.4%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 40051 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

DiagnosisSubtype_specimen_specimens
Categorical

High correlation  Imbalance  Missing 

Distinct: 2
Distinct (%): 0.1%
Missing: 543
Missing (%): 19.9%
Memory size: 21.5 KiB

Length

Max length: 16
Median length: 16
Mean length: 15.846575
Min length: 12

Characters and Unicode

Total characters: 34704
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: epithelioid type
2nd row: epithelioid type
3rd row: epithelioid type
4th row: epithelioid type
5th row: epithelioid type

Common Values

Value | Count | Frequency (%)
epithelioid type | 2106 | 77.1%
spindle type | 84 | 3.1%
(Missing) | 543 | 19.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
type | 2190 | 50.0%
epithelioid | 2106 | 48.1%
spindle | 84 | 1.9%

Most occurring characters

Value | Count | Frequency (%)
e | 6486 | 18.7%
i | 6402 | 18.4%
p | 4380 | 12.6%
t | 4296 | 12.4%
l | 2190 | 6.3%
d | 2190 | 6.3%
y | 2190 | 6.3%
(whitespace) | 2190 | 6.3%
o | 2106 | 6.1%
h | 2106 | 6.1%
Other values (2) | 168 | 0.5%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 34704 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

PDX GrowthCurve Avail_specimen_specimens
Boolean

High correlation  Imbalance 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Value | Count | Frequency (%)
True | 2727 | 99.8%
False | 6 | 0.2%

Consensus Whole ExomeSequence Avail_specimen_specimens
Boolean

High correlation  Imbalance 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Value | Count | Frequency (%)
True | 2727 | 99.8%
False | 6 | 0.2%

MSI Status_specimen_specimens
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Length

Max length: 10
Median length: 10
Mean length: 9.9934138
Min length: 7

Characters and Unicode

Total characters: 27312
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MSI-Stable
2nd row: MSI-Stable
3rd row: MSI-Stable
4th row: MSI-Stable
5th row: MSI-Stable

Common Values

Value | Count | Frequency (%)
MSI-Stable | 2727 | 99.8%
Unknown | 6 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
msi-stable | 2727 | 99.8%
unknown | 6 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
S | 5454 | 20.0%
M | 2727 | 10.0%
I | 2727 | 10.0%
- | 2727 | 10.0%
t | 2727 | 10.0%
a | 2727 | 10.0%
b | 2727 | 10.0%
l | 2727 | 10.0%
e | 2727 | 10.0%
n | 18 | 0.1%
Other values (4) | 24 | 0.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 27312 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

Distinct: 10
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB
Minimum: 2015-01-01 00:00:00
Maximum: 2021-11-01 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=10)

Human PathogenTesting Summary_specimen_specimens
Categorical

High correlation  Imbalance 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Length

Max length: 10
Median length: 8
Mean length: 8.1061105
Min length: 8

Characters and Unicode

Total characters: 22154
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Negative
2nd row: Negative
3rd row: Negative
4th row: Negative
5th row: Negative

Common Values

Value | Count | Frequency (%)
Negative | 2446 | 89.5%
Negative | 200 | 7.3%
Negative | 84 | 3.1%
Negative | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
negative | 2733 | 100.0%
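
The four distinct values that all print as Negative, together with a maximum length of 10 against a minimum of 8, suggest that the raw strings differ only in surrounding whitespace. The sketch below illustrates how such variants collapse into one category once stripped. The specific whitespace characters used here (a trailing space and a trailing newline) are hypothetical stand-ins, since the report does not reveal which characters are actually present.

```python
from collections import Counter

# Hypothetical reconstruction of the raw column. The counts come from this
# variable's Common Values table; the trailing characters are assumptions.
raw = (
    ["Negative"] * 2446
    + ["Negative "] * 200
    + ["Negative\n"] * 84
    + ["Negative \n"] * 3
)

distinct_before = len(set(raw))
total_chars = sum(len(value) for value in raw)

# Strip leading and trailing whitespace, then recount the categories.
cleaned = [value.strip() for value in raw]
cleaned_counts = Counter(cleaned)

print(distinct_before, total_chars)
print(cleaned_counts)
```

With these stand-ins the reconstruction matches the reported totals (4 distinct values, 22154 characters, two whitespace characters occurring 203 and 87 times), and stripping leaves a single category covering all 2733 rows.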

Most occurring characters

Value | Count | Frequency (%)
e | 5466 | 24.7%
N | 2733 | 12.3%
g | 2733 | 12.3%
a | 2733 | 12.3%
t | 2733 | 12.3%
i | 2733 | 12.3%
v | 2733 | 12.3%
(whitespace) | 203 | 0.9%
(other whitespace) | 87 | 0.4%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 22154 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

Able to ViablyPassage in nude mice_specimen_specimens
Categorical

High correlation  Imbalance 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.5 KiB

Length

Max length: 7
Median length: 3
Mean length: 3.1170874
Min length: 2

Characters and Unicode

Total characters: 8519
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Yes
2nd row: Yes
3rd row: Yes
4th row: Yes
5th row: Yes

Common Values

Value | Count | Frequency (%)
Yes | 2518 | 92.1%
No | 108 | 4.0%
Unknown | 107 | 3.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
yes | 2518 | 92.1%
no | 108 | 4.0%
unknown | 107 | 3.9%

Most occurring characters

Value | Count | Frequency (%)
Y | 2518 | 29.6%
e | 2518 | 29.6%
s | 2518 | 29.6%
n | 321 | 3.8%
o | 215 | 2.5%
N | 108 | 1.3%
U | 107 | 1.3%
k | 107 | 1.3%
w | 107 | 1.3%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 8519 | 100.0%

Most frequent character per category, script, and block: identical to the Most occurring characters table above.

Age atSampling_specimen_specimens
Real number (ℝ)

High correlation 

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 57.607757
Minimum: 32
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.5 KiB

Quantile statistics

Minimum: 32
5-th percentile: 39
Q1: 59
Median: 59
Q3: 59
95-th percentile: 67
Maximum: 81
Range: 49
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6.8836151
Coefficient of variation (CV): 0.11949111
Kurtosis: 3.3487549
Mean: 57.607757
Median Absolute Deviation (MAD): 0
Skewness: -1.8342077
Sum: 157442
Variance: 47.384156
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)
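
Values by frequency

Value | Count | Frequency (%)
59 | 2106 | 77.1%
39 | 284 | 10.4%
60 | 128 | 4.7%
67 | 108 | 4.0%
69 | 98 | 3.6%
32 | 3 | 0.1%
52 | 2 | 0.1%
36 | 2 | 0.1%
81 | 2 | 0.1%

Values in ascending order

Value | Count | Frequency (%)
32 | 3 | 0.1%
36 | 2 | 0.1%
39 | 284 | 10.4%
52 | 2 | 0.1%
59 | 2106 | 77.1%
60 | 128 | 4.7%
67 | 108 | 4.0%
69 | 98 | 3.6%
81 | 2 | 0.1%

Values in descending order

Value | Count | Frequency (%)
81 | 2 | 0.1%
69 | 98 | 3.6%
67 | 108 | 4.0%
60 | 128 | 4.7%
59 | 2106 | 77.1%
52 | 2 | 0.1%
39 | 284 | 10.4%
36 | 2 | 0.1%
32 | 3 | 0.1%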
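
Several of the summary statistics for this variable can be recomputed from its nine value counts alone. The sketch below does so with Python's standard `statistics` module. It assumes a sample variance with an n - 1 denominator, which is the convention that reproduces the reported figure. Skewness and kurtosis are left out.

```python
import statistics

# Value counts copied from this variable's frequency table (age -> count).
value_counts = {59: 2106, 39: 284, 60: 128, 67: 108, 69: 98,
                32: 3, 52: 2, 36: 2, 81: 2}

# Expand the counts back into one observation per row.
ages = sorted(age for age, n in value_counts.items() for _ in range(n))

n_obs = len(ages)
total = sum(ages)
mean = total / n_obs
variance = statistics.variance(ages)      # sample variance (n - 1)
std_dev = variance ** 0.5
cv = std_dev / mean                       # coefficient of variation
q1, median, q3 = statistics.quantiles(ages, n=4)
iqr = q3 - q1
value_range = max(ages) - min(ages)

print(n_obs, total, mean, variance, std_dev, cv, iqr, value_range)
```

The interquartile range is 0 because 77.1% of observations share the value 59, so both quartiles and the median land on it.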

ModelNotes_specimen_specimens
Categorical

High correlation  Imbalance  Missing 

Distinct: 3
Distinct (%): 0.1%
Missing: 513
Missing (%): 18.8%
Memory size: 21.5 KiB

Length

Max length: 124
Median length: 50
Mean length: 53.502703
Min length: 14

Characters and Unicode

Total characters: 118776
Distinct characters: 45
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)
2nd row: PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)
3rd row: PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)
4th row: PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)
5th row: PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)

Common Values

Value | Count | Frequency (%)
PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells) | 2106 | 77.1%
PDX Model Derivation: P0 slow growth; material pooled from Day 300 tumor due to age-related mortality and implanted into P1. | 108 | 4.0%
No PDX Growth | 6 | 0.2%
(Missing) | 513 | 18.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
pdx | 2220 | 13.1%
ihc/path | 2106 | 12.4%
cd34 | 2106 | 12.4%
diffusely | 2106 | 12.4%
sma | 2106 | 12.4%
rare | 2106 | 12.4%
cells | 2106 | 12.4%
growth | 114 | 0.7%
model | 108 | 0.6%
p0 | 108 | 0.6%
Other values (17) | 1734 | 10.2%

Most occurring characters

ValueCountFrequency (%)
14706
 
12.4%
e 7290
 
6.1%
l 7074
 
6.0%
a 5184
 
4.4%
r 4974
 
4.2%
D 4542
 
3.8%
P 4542
 
3.8%
s 4320
 
3.6%
f 4320
 
3.6%
) 4212
 
3.5%
Other values (35) 57612
48.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 118776
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 118776
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 118776
100.0%

ProvidedTissue Origin_specimen_specimens
Categorical

High correlation 

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size21.5 KiB
Metastatic Site
2106 
Primary
627 

Length

Max length15
Median length15
Mean length13.164654
Min length7

Characters and Unicode

Total characters35979
Distinct characters13
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowMetastatic Site
2nd rowMetastatic Site
3rd rowMetastatic Site
4th rowMetastatic Site
5th rowMetastatic Site

Common Values

ValueCountFrequency (%)
Metastatic Site 2106
77.1%
Primary 627
 
22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
metastatic 2106
43.5%
site 2106
43.5%
primary 627
 
13.0%

Most occurring characters

ValueCountFrequency (%)
t 8424
23.4%
a 4839
13.4%
i 4839
13.4%
e 4212
11.7%
M 2106
 
5.9%
s 2106
 
5.9%
c 2106
 
5.9%
2106
 
5.9%
S 2106
 
5.9%
r 1254
 
3.5%
Other values (3) 1881
 
5.2%

Most occurring categories

ValueCountFrequency (%)
(unknown) 35979
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 35979
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 35979
100.0%

Sample ID_samples
Categorical

High correlation 

Distinct44
Distinct (%)1.6%
Missing9
Missing (%)0.3%
Memory size21.5 KiB
AH5
234 
AH5Q86T81
234 
AH5Q89
234 
AH6T26
234 
AH6T28
234 
Other values (39)
1554 

Length

Max length19
Median length16
Mean length6.2621145
Min length3

Characters and Unicode

Total characters17058
Distinct characters41
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowAH5
2nd rowAH5
3rd rowAH5
4th rowAH5
5th rowAH5

Common Values

ValueCountFrequency (%)
AH5 234
8.6%
AH5Q86T81 234
8.6%
AH5Q89 234
8.6%
AH6T26 234
8.6%
AH6T28 234
8.6%
AH8 234
8.6%
AH8Q94 234
8.6%
AH9 234
8.6%
Originator 234
8.6%
ORIGINATOR 66
 
2.4%
Other values (34) 552
20.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
originator 300
11.0%
ah5q86t81 234
8.6%
ah5 234
8.6%
ah5q89 234
8.6%
ah6t26 234
8.6%
ah8 234
8.6%
ah6t28 234
8.6%
ah8q94 234
8.6%
ah9 234
8.6%
vqf 20
 
0.7%
Other values (33) 532
19.5%
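
The Common Values table lists "Originator" (234) and "ORIGINATOR" (66) as separate values, while the lower-cased table above reports a single "originator" entry with 300. A minimal sketch of merging such case variants before analysis; the helper name `normalise_case` is hypothetical, and only a few values from the table are included:

```python
from collections import Counter

# Counts copied from the Common Values table (subset for illustration).
raw_counts = Counter({"Originator": 234, "ORIGINATOR": 66, "AH5": 234})


def normalise_case(counts: Counter) -> Counter:
    """Merge values that differ only in letter case or surrounding whitespace."""
    merged = Counter()
    for value, n in counts.items():
        merged[value.strip().lower()] += n
    return merged


merged = normalise_case(raw_counts)
print(merged["originator"])
```

Whether the two spellings really denote the same thing is a judgment for whoever owns the data; the sketch only shows the mechanical merge.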

Most occurring characters

ValueCountFrequency (%)
H 2072
 
12.1%
A 1986
 
11.6%
8 1570
 
9.2%
6 1112
 
6.5%
Q 1030
 
6.0%
T 848
 
5.0%
5 786
 
4.6%
9 752
 
4.4%
O 546
 
3.2%
2 504
 
3.0%
Other values (31) 5852
34.3%

Most occurring categories

ValueCountFrequency (%)
(unknown) 17058
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 17058
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 17058
100.0%

DiagnosisSubtype_samples
Categorical

High correlation  Imbalance  Missing 

Distinct2
Distinct (%)0.1%
Missing543
Missing (%)19.9%
Memory size21.5 KiB
epithelioid type
2106 
spindle type
 
84

Length

Max length16
Median length16
Mean length15.846575
Min length12

Characters and Unicode

Total characters34704
Distinct characters12
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowepithelioid type
2nd rowepithelioid type
3rd rowepithelioid type
4th rowepithelioid type
5th rowepithelioid type

Common Values

ValueCountFrequency (%)
epithelioid type 2106
77.1%
spindle type 84
 
3.1%
(Missing) 543
 
19.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
type 2190
50.0%
epithelioid 2106
48.1%
spindle 84
 
1.9%

Most occurring characters

ValueCountFrequency (%)
e 6486
18.7%
i 6402
18.4%
p 4380
12.6%
t 4296
12.4%
l 2190
 
6.3%
d 2190
 
6.3%
y 2190
 
6.3%
2190
 
6.3%
o 2106
 
6.1%
h 2106
 
6.1%
Other values (2) 168
 
0.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 34704
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 34704
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 34704
100.0%

Patient/OriginatingSpecimen_samples
Boolean

High correlation 

Distinct2
Distinct (%)0.1%
Missing9
Missing (%)0.3%
Memory size5.5 KiB
False
2424 
True
300 
(Missing)
 
9
ValueCountFrequency (%)
False 2424
88.7%
True 300
 
11.0%
(Missing) 9
 
0.3%

Passage_samples
Categorical

High correlation  Missing 

Distinct5
Distinct (%)0.2%
Missing309
Missing (%)11.3%
Memory size21.5 KiB
1.0
1138 
0.0
848 
2.0
350 
3.0
 
52
4.0
 
36

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters7272
Distinct characters6
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
1.0 1138
41.6%
0.0 848
31.0%
2.0 350
 
12.8%
3.0 52
 
1.9%
4.0 36
 
1.3%
(Missing) 309
 
11.3%
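
The passage numbers are profiled as text such as "1.0" rather than as integers. A minimal sketch of converting these strings back to whole numbers while keeping missing entries; the function name `parse_passage` is hypothetical, and the guess that the ".0" suffix comes from an earlier float conversion is an assumption, not something the report states:

```python
from typing import Optional


def parse_passage(raw: Optional[str]) -> Optional[int]:
    """Convert a float-like string such as '1.0' to an int; keep missing as None."""
    if raw is None or raw.strip() == "":
        return None
    value = float(raw)
    # Refuse values like '1.5' that are not whole passage numbers.
    if not value.is_integer():
        raise ValueError(f"not a whole passage number: {raw!r}")
    return int(value)


# The five distinct values listed above, plus one missing entry.
parsed = [parse_passage(v) for v in ["0.0", "1.0", "2.0", "3.0", "4.0", None]]
print(parsed)
```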

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 1138
46.9%
0.0 848
35.0%
2.0 350
 
14.4%
3.0 52
 
2.1%
4.0 36
 
1.5%

Most occurring characters

ValueCountFrequency (%)
0 3272
45.0%
. 2424
33.3%
1 1138
 
15.6%
2 350
 
4.8%
3 52
 
0.7%
4 36
 
0.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 7272
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 7272
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 7272
100.0%

PathologyAvail_samples
Boolean

High correlation  Imbalance 

Distinct2
Distinct (%)0.1%
Missing9
Missing (%)0.3%
Memory size5.5 KiB
True
2712 
False
 
12
(Missing)
 
9
ValueCountFrequency (%)
True 2712
99.2%
False 12
 
0.4%
(Missing) 9
 
0.3%
Distinct2
Distinct (%)0.1%
Missing9
Missing (%)0.3%
Memory size5.5 KiB
True
1886 
False
838 
(Missing)
 
9
ValueCountFrequency (%)
True 1886
69.0%
False 838
30.7%
(Missing) 9
 
0.3%

Whole ExomeSequence Avail_samples
Boolean

High correlation 

Distinct2
Distinct (%)0.1%
Missing9
Missing (%)0.3%
Memory size5.5 KiB
True
1886 
False
838 
(Missing)
 
9
ValueCountFrequency (%)
True 1886
69.0%
False 838
30.7%
(Missing) 9
 
0.3%

RNASeqAvail_samples
Boolean

High correlation 

Distinct2
Distinct (%)0.1%
Missing9
Missing (%)0.3%
Memory size5.5 KiB
True
1874 
False
850 
(Missing)
 
9
ValueCountFrequency (%)
True 1874
68.6%
False 850
31.1%
(Missing) 9
 
0.3%

PDM Type_samples
Categorical

High correlation 

Distinct2
Distinct (%)0.1%
Missing9
Missing (%)0.3%
Memory size21.5 KiB
PDX
2424 
Patient/Originator Specimen
300 

Length

Max length27
Median length3
Mean length5.6431718
Min length3

Characters and Unicode

Total characters15372
Distinct characters18
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowPDX
2nd rowPDX
3rd rowPDX
4th rowPDX
5th rowPDX

Common Values

ValueCountFrequency (%)
PDX 2424
88.7%
Patient/Originator Specimen 300
 
11.0%
(Missing) 9
 
0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
pdx 2424
80.2%
patient/originator 300
 
9.9%
specimen 300
 
9.9%

Most occurring characters

ValueCountFrequency (%)
P 2724
17.7%
D 2424
15.8%
X 2424
15.8%
i 1200
7.8%
t 900
 
5.9%
e 900
 
5.9%
n 900
 
5.9%
a 600
 
3.9%
r 600
 
3.9%
/ 300
 
2.0%
Other values (8) 2400
15.6%

Most occurring categories

ValueCountFrequency (%)
(unknown) 15372
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 15372
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 15372
100.0%

Sample ID_gene
Categorical

High correlation  Missing 

Distinct6
Distinct (%)0.3%
Missing627
Missing (%)22.9%
Memory size21.5 KiB
AH9
486 
AH5Q86T81
324 
AH6T26
324 
AH5Q89
324 
AH8Q94
324 

Length

Max length9
Median length6
Mean length5.3076923
Min length3

Characters and Unicode

Total characters11178
Distinct characters11
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowAH5Q86T81
2nd rowAH5Q86T81
3rd rowAH5Q86T81
4th rowAH5Q86T81
5th rowAH5Q86T81

Common Values

ValueCountFrequency (%)
AH9 486
17.8%
AH5Q86T81 324
11.9%
AH6T26 324
11.9%
AH5Q89 324
11.9%
AH8Q94 324
11.9%
AH8 324
11.9%
(Missing) 627
22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ah9 486
23.1%
ah5q86t81 324
15.4%
ah6t26 324
15.4%
ah5q89 324
15.4%
ah8q94 324
15.4%
ah8 324
15.4%

Most occurring characters

ValueCountFrequency (%)
A 2106
18.8%
H 2106
18.8%
8 1620
14.5%
9 1134
10.1%
Q 972
8.7%
6 972
8.7%
5 648
 
5.8%
T 648
 
5.8%
1 324
 
2.9%
2 324
 
2.9%

Most occurring categories

ValueCountFrequency (%)
(unknown) 11178
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 11178
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 11178
100.0%

HugoSymbol_gene
Categorical

High correlation  Imbalance  Missing 

Distinct2
Distinct (%)0.1%
Missing627
Missing (%)22.9%
Memory size21.5 KiB
KIT
1944 
KMT2C
 
162

Length

Max length5
Median length3
Mean length3.1538462
Min length3

Characters and Unicode

Total characters6642
Distinct characters6
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowKIT
2nd rowKIT
3rd rowKIT
4th rowKIT
5th rowKIT

Common Values

ValueCountFrequency (%)
KIT 1944
71.1%
KMT2C 162
 
5.9%
(Missing) 627
 
22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
kit 1944
92.3%
kmt2c 162
 
7.7%

Most occurring characters

ValueCountFrequency (%)
K 2106
31.7%
T 2106
31.7%
I 1944
29.3%
M 162
 
2.4%
2 162
 
2.4%
C 162
 
2.4%

Most occurring categories

ValueCountFrequency (%)
(unknown) 6642
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 6642
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 6642
100.0%

Chr_gene
Categorical

High correlation  Imbalance  Missing 

Distinct2
Distinct (%)0.1%
Missing627
Missing (%)22.9%
Memory size21.5 KiB
chr4
1944 
chr7
 
162

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters8424
Distinct characters5
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowchr4
2nd rowchr4
3rd rowchr4
4th rowchr4
5th rowchr4

Common Values

ValueCountFrequency (%)
chr4 1944
71.1%
chr7 162
 
5.9%
(Missing) 627
 
22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
chr4 1944
92.3%
chr7 162
 
7.7%

Most occurring characters

ValueCountFrequency (%)
c 2106
25.0%
h 2106
25.0%
r 2106
25.0%
4 1944
23.1%
7 162
 
1.9%

Most occurring categories

ValueCountFrequency (%)
(unknown) 8424
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 8424
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 8424
100.0%

Chr End_gene
Categorical

High correlation  Missing 

Distinct3
Distinct (%)0.1%
Missing627
Missing (%)22.9%
Memory size21.5 KiB
55593654.0
972 
55599332.0
972 
151882672.0
162 

Length

Max length11
Median length10
Mean length10.076923
Min length10

Characters and Unicode

Total characters21222
Distinct characters11
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row55593654.0
2nd row55593654.0
3rd row55593654.0
4th row55593654.0
5th row55593654.0

Common Values

ValueCountFrequency (%)
55593654.0 972
35.6%
55599332.0 972
35.6%
151882672.0 162
 
5.9%
(Missing) 627
22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
55593654.0 972
46.2%
55599332.0 972
46.2%
151882672.0 162
 
7.7%

Most occurring characters

ValueCountFrequency (%)
5 6966
32.8%
9 2916
13.7%
3 2916
13.7%
0 2106
 
9.9%
. 2106
 
9.9%
2 1296
 
6.1%
6 1134
 
5.3%
4 972
 
4.6%
1 324
 
1.5%
8 324
 
1.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 21222
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 21222
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 21222
100.0%

Chr Start_gene
Categorical

High correlation  Missing 

Distinct3
Distinct (%)0.1%
Missing627
Missing (%)22.9%
Memory size21.5 KiB
55593601.0
972 
55599332.0
972 
151882672.0
162 

Length

Max length11
Median length10
Mean length10.076923
Min length10

Characters and Unicode

Total characters21222
Distinct characters10
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row55593601.0
2nd row55593601.0
3rd row55593601.0
4th row55593601.0
5th row55593601.0

Common Values

ValueCountFrequency (%)
55593601.0 972
35.6%
55599332.0 972
35.6%
151882672.0 162
 
5.9%
(Missing) 627
22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
55593601.0 972
46.2%
55599332.0 972
46.2%
151882672.0 162
 
7.7%

Most occurring characters

ValueCountFrequency (%)
5 5994
28.2%
0 3078
14.5%
9 2916
13.7%
3 2916
13.7%
. 2106
 
9.9%
1 1296
 
6.1%
2 1296
 
6.1%
6 1134
 
5.3%
8 324
 
1.5%
7 162
 
0.8%

Most occurring categories

ValueCountFrequency (%)
(unknown) 21222
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 21222
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 21222
100.0%

HGVS ProteinChange_gene
Categorical

High correlation  Missing 

Distinct3
Distinct (%)0.1%
Missing627
Missing (%)22.9%
Memory size21.5 KiB
p.Q556_T574delinsP
972 
p.D820Y
972 
p.A1685S
162 

Length

Max length18
Median length8
Mean length12.153846
Min length7

Characters and Unicode

Total characters25596
Distinct characters24
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowp.Q556_T574delinsP
2nd rowp.Q556_T574delinsP
3rd rowp.Q556_T574delinsP
4th rowp.Q556_T574delinsP
5th rowp.Q556_T574delinsP

Common Values

ValueCountFrequency (%)
p.Q556_T574delinsP 972
35.6%
p.D820Y 972
35.6%
p.A1685S 162
 
5.9%
(Missing) 627
22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
p.q556_t574delinsp 972
46.2%
p.d820y 972
46.2%
p.a1685s 162
 
7.7%

Most occurring characters

ValueCountFrequency (%)
5 3078
 
12.0%
p 2106
 
8.2%
. 2106
 
8.2%
6 1134
 
4.4%
8 1134
 
4.4%
Q 972
 
3.8%
T 972
 
3.8%
_ 972
 
3.8%
7 972
 
3.8%
4 972
 
3.8%
Other values (14) 11178
43.7%

Most occurring categories

ValueCountFrequency (%)
(unknown) 25596
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 25596
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 25596
100.0%

HGVS cDNAChange_gene
Categorical

High correlation  Missing 

Distinct3
Distinct (%)0.1%
Missing627
Missing (%)22.9%
Memory size21.5 KiB
c.1667_1720del
972 
c.2458G>T
972 
c.5053G>T
162 

Length

Max length14
Median length9
Mean length11.307692
Min length9

Characters and Unicode

Total characters23814
Distinct characters18
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowc.1667_1720del
2nd rowc.1667_1720del
3rd rowc.1667_1720del
4th rowc.1667_1720del
5th rowc.1667_1720del

Common Values

ValueCountFrequency (%)
c.1667_1720del 972
35.6%
c.2458G>T 972
35.6%
c.5053G>T 162
 
5.9%
(Missing) 627
22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
c.1667_1720del 972
46.2%
c.2458g>t 972
46.2%
c.5053g>t 162
 
7.7%

Most occurring characters

ValueCountFrequency (%)
c 2106
 
8.8%
. 2106
 
8.8%
1 1944
 
8.2%
6 1944
 
8.2%
7 1944
 
8.2%
2 1944
 
8.2%
5 1296
 
5.4%
0 1134
 
4.8%
T 1134
 
4.8%
> 1134
 
4.8%
Other values (8) 7128
29.9%

Most occurring categories

ValueCountFrequency (%)
(unknown) 23814
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 23814
100.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 23814
100.0%

Total Reads_gene
Real number (ℝ)

High correlation  Missing 

Distinct12
Distinct (%)0.6%
Missing627
Missing (%)22.9%
Infinite0
Infinite (%)0.0%
Mean459.30769
Minimum45
Maximum565
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size21.5 KiB

Quantile statistics

Minimum45
5-th percentile45
Q1458
median490
Q3501
95-th percentile565
Maximum565
Range520
Interquartile range (IQR)43

Descriptive statistics

Standard deviation124.48801
Coefficient of variation (CV)0.27103401
Kurtosis6.5267054
Mean459.30769
Median Absolute Deviation (MAD)32
Skewness-2.75089
Sum967302
Variance15497.264
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
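
The quantile and descriptive statistics above can be re-derived from the frequency tables for this variable. A minimal sketch using Python's standard `statistics` module; the twelve (value, count) pairs are copied from those tables (45 and 440 are the two "Other values"), and the choice of the sample standard deviation and of inclusive quantile interpolation is an assumption about how the report computes its figures:

```python
import statistics

# (value, count) pairs copied from the Total Reads_gene frequency tables.
value_counts = [
    (45, 162), (440, 162), (441, 162), (458, 162), (480, 162), (487, 162),
    (490, 162), (500, 162), (501, 324), (523, 162), (540, 162), (565, 162),
]

# Expand the table back into one sorted observation per row.
data = sorted(v for v, n in value_counts for _ in range(n))

mean = statistics.mean(data)
std = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

summary = {
    "n": len(data),
    "sum": sum(data),
    "mean": mean,
    "std": std,
    "cv": std / mean,               # coefficient of variation
    "range": max(data) - min(data),
    "iqr": q3 - q1,                 # interquartile range
}
print(summary)
```

The strongly negative skewness is consistent with the single low value of 45 sitting far below the remaining values, which all lie between 440 and 565.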
ValueCountFrequency (%)
501 324
11.9%
487 162
 
5.9%
458 162
 
5.9%
441 162
 
5.9%
500 162
 
5.9%
565 162
 
5.9%
540 162
 
5.9%
490 162
 
5.9%
523 162
 
5.9%
480 162
 
5.9%
Other values (2) 324
11.9%
(Missing) 627
22.9%
ValueCountFrequency (%)
45 162
5.9%
440 162
5.9%
441 162
5.9%
458 162
5.9%
480 162
5.9%
487 162
5.9%
490 162
5.9%
500 162
5.9%
501 324
11.9%
523 162
5.9%
ValueCountFrequency (%)
565 162
5.9%
540 162
5.9%
523 162
5.9%
501 324
11.9%
500 162
5.9%
490 162
5.9%
487 162
5.9%
480 162
5.9%
458 162
5.9%
441 162
5.9%

Variant AlleleFrequency_gene
Real number (ℝ)

High correlation  Missing 

Distinct13
Distinct (%)0.6%
Missing627
Missing (%)22.9%
Infinite0
Infinite (%)0.0%
Mean0.38618462
Minimum0.1778
Maximum0.53
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size21.5 KiB

Quantile statistics

Minimum0.1778
5-th percentile0.1778
Q10.2991
median0.3204
Q30.5087
95-th percentile0.53
Maximum0.53
Range0.3522
Interquartile range (IQR)0.2096

Descriptive statistics

Standard deviation0.11938254
Coefficient of variation (CV)0.30913333
Kurtosis-1.5065269
Mean0.38618462
Median Absolute Deviation (MAD)0.1426
Skewness-0.076833756
Sum813.3048
Variance0.01425219
MonotonicityNot monotonic
Histogram with fixed size bins (bins=13)
ValueCountFrequency (%)
0.2813 162
 
5.9%
0.3039 162
 
5.9%
0.5087 162
 
5.9%
0.3174 162
 
5.9%
0.521 162
 
5.9%
0.53 162
 
5.9%
0.2991 162
 
5.9%
0.3204 162
 
5.9%
0.4918 162
 
5.9%
0.5296 162
 
5.9%
Other values (3) 486
17.8%
(Missing) 627
22.9%
ValueCountFrequency (%)
0.1778 162
5.9%
0.2667 162
5.9%
0.2813 162
5.9%
0.2991 162
5.9%
0.3039 162
5.9%
0.3174 162
5.9%
0.3204 162
5.9%
0.4727 162
5.9%
0.4918 162
5.9%
0.5087 162
5.9%
ValueCountFrequency (%)
0.53 162
5.9%
0.5296 162
5.9%
0.521 162
5.9%
0.5087 162
5.9%
0.4918 162
5.9%
0.4727 162
5.9%
0.3204 162
5.9%
0.3174 162
5.9%
0.3039 162
5.9%
0.2991 162
5.9%

VariantClass_gene
Categorical

High correlation  Missing 

Distinct2
Distinct (%)0.1%
Missing627
Missing (%)22.9%
Memory size21.5 KiB
Missense_Mutation
1134 
In_Frame_Del
972 

Length

Max length17
Median length17
Mean length14.692308
Min length12

Characters and Unicode

Total characters30942
Distinct characters16
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowIn_Frame_Del
2nd rowIn_Frame_Del
3rd rowIn_Frame_Del
4th rowIn_Frame_Del
5th rowIn_Frame_Del

Common Values

ValueCountFrequency (%)
Missense_Mutation 1134
41.5%
In_Frame_Del 972
35.6%
(Missing) 627
22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
missense_mutation 1134
53.8%
in_frame_del 972
46.2%

Most occurring characters

ValueCountFrequency (%)
e 4212
13.6%
s 3402
11.0%
n 3240
10.5%
_ 3078
9.9%
M 2268
 
7.3%
i 2268
 
7.3%
t 2268
 
7.3%
a 2106
 
6.8%
u 1134
 
3.7%
o 1134
 
3.7%
Other values (6) 5832
18.8%

Most occurring categories

ValueCountFrequency (%)
(unknown) 30942
100.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 30942
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e 4212
13.6%
s 3402
11.0%
n 3240
10.5%
_ 3078
9.9%
M 2268
 
7.3%
i 2268
 
7.3%
t 2268
 
7.3%
a 2106
 
6.8%
u 1134
 
3.7%
o 1134
 
3.7%
Other values (6) 5832
18.8%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 30942
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e 4212
13.6%
s 3402
11.0%
n 3240
10.5%
_ 3078
9.9%
M 2268
 
7.3%
i 2268
 
7.3%
t 2268
 
7.3%
a 2106
 
6.8%
u 1134
 
3.7%
o 1134
 
3.7%
Other values (6) 5832
18.8%

MutationEffect_gene
Categorical

High correlation  Missing 

Distinct3
Distinct (%)0.1%
Missing627
Missing (%)22.9%
Memory size21.5 KiB
Gain-of-function
972 
Inconclusive
972 
Loss-of-function
162 

Length

Max length16
Median length16
Mean length14.153846
Min length12

Characters and Unicode

Total characters29808
Distinct characters16
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowGain-of-function
2nd rowGain-of-function
3rd rowGain-of-function
4th rowGain-of-function
5th rowGain-of-function

Common Values

Value | Count | Frequency (%)
Gain-of-function | 972 | 35.6%
Inconclusive | 972 | 35.6%
Loss-of-function | 162 | 5.9%
(Missing) | 627 | 22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
gain-of-function 972
46.2%
inconclusive 972
46.2%
loss-of-function 162
 
7.7%

Most occurring characters

ValueCountFrequency (%)
n 5184
17.4%
o 3402
11.4%
i 3078
10.3%
c 3078
10.3%
f 2268
7.6%
- 2268
7.6%
u 2106
7.1%
s 1296
 
4.3%
t 1134
 
3.8%
G 972
 
3.3%
Other values (6) 5022
16.8%

Most occurring categories

ValueCountFrequency (%)
(unknown) 29808
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
n 5184
17.4%
o 3402
11.4%
i 3078
10.3%
c 3078
10.3%
f 2268
7.6%
- 2268
7.6%
u 2106
7.1%
s 1296
 
4.3%
t 1134
 
3.8%
G 972
 
3.3%
Other values (6) 5022
16.8%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 29808
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
n 5184
17.4%
o 3402
11.4%
i 3078
10.3%
c 3078
10.3%
f 2268
7.6%
- 2268
7.6%
u 2106
7.1%
s 1296
 
4.3%
t 1134
 
3.8%
G 972
 
3.3%
Other values (6) 5022
16.8%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 29808
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
n 5184
17.4%
o 3402
11.4%
i 3078
10.3%
c 3078
10.3%
f 2268
7.6%
- 2268
7.6%
u 2106
7.1%
s 1296
 
4.3%
t 1134
 
3.8%
G 972
 
3.3%
Other values (6) 5022
16.8%

Oncogenicity_gene
Categorical

High correlation  Missing 

Distinct3
Distinct (%)0.1%
Missing627
Missing (%)22.9%
Memory size21.5 KiB
Oncogenic
972 
Resistance
972 
Likely Oncogenic
162 

Length

Max length16
Median length10
Mean length10
Min length9

Characters and Unicode

Total characters21060
Distinct characters16
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowOncogenic
2nd rowOncogenic
3rd rowOncogenic
4th rowOncogenic
5th rowOncogenic

Common Values

Value | Count | Frequency (%)
Oncogenic | 972 | 35.6%
Resistance | 972 | 35.6%
Likely Oncogenic | 162 | 5.9%
(Missing) | 627 | 22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
oncogenic 1134
50.0%
resistance 972
42.9%
likely 162
 
7.1%

Most occurring characters

ValueCountFrequency (%)
n 3240
15.4%
c 3240
15.4%
e 3240
15.4%
i 2268
10.8%
s 1944
9.2%
o 1134
 
5.4%
g 1134
 
5.4%
O 1134
 
5.4%
R 972
 
4.6%
t 972
 
4.6%
Other values (6) 1782
8.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 21060
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
n 3240
15.4%
c 3240
15.4%
e 3240
15.4%
i 2268
10.8%
s 1944
9.2%
o 1134
 
5.4%
g 1134
 
5.4%
O 1134
 
5.4%
R 972
 
4.6%
t 972
 
4.6%
Other values (6) 1782
8.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 21060
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
n 3240
15.4%
c 3240
15.4%
e 3240
15.4%
i 2268
10.8%
s 1944
9.2%
o 1134
 
5.4%
g 1134
 
5.4%
O 1134
 
5.4%
R 972
 
4.6%
t 972
 
4.6%
Other values (6) 1782
8.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 21060
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
n 3240
15.4%
c 3240
15.4%
e 3240
15.4%
i 2268
10.8%
s 1944
9.2%
o 1134
 
5.4%
g 1134
 
5.4%
O 1134
 
5.4%
R 972
 
4.6%
t 972
 
4.6%
Other values (6) 1782
8.5%

Existing Variant_gene
Categorical

High correlation  Missing 

Distinct2
Distinct (%)0.2%
Missing1599
Missing (%)58.5%
Memory size21.5 KiB
CM023092,KinMutBase_KIT_DNA:g.76151G>T,COSM12710,COSM17947,COSM22379
972 
rs145848316,COSM249560,COSM249561
162 

Length

Max length68
Median length68
Mean length63
Min length33

Characters and Unicode

Total characters71442
Distinct characters36
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCM023092,KinMutBase_KIT_DNA:g.76151G>T,COSM12710,COSM17947,COSM22379
2nd rowCM023092,KinMutBase_KIT_DNA:g.76151G>T,COSM12710,COSM17947,COSM22379
3rd rowCM023092,KinMutBase_KIT_DNA:g.76151G>T,COSM12710,COSM17947,COSM22379
4th rowCM023092,KinMutBase_KIT_DNA:g.76151G>T,COSM12710,COSM17947,COSM22379
5th rowCM023092,KinMutBase_KIT_DNA:g.76151G>T,COSM12710,COSM17947,COSM22379

Common Values

Value | Count | Frequency (%)
CM023092,KinMutBase_KIT_DNA:g.76151G>T,COSM12710,COSM17947,COSM22379 | 972 | 35.6%
rs145848316,COSM249560,COSM249561 | 162 | 5.9%
(Missing) | 1599 | 58.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
cm023092,kinmutbase_kit_dna:g.76151g>t,cosm12710,cosm17947,cosm22379 972
85.7%
rs145848316,cosm249560,cosm249561 162
 
14.3%

Most occurring characters

ValueCountFrequency (%)
1 5346
 
7.5%
M 5184
 
7.3%
2 5184
 
7.3%
7 4860
 
6.8%
, 4212
 
5.9%
C 4212
 
5.9%
O 3240
 
4.5%
9 3240
 
4.5%
S 3240
 
4.5%
0 3078
 
4.3%
Other values (26) 29646
41.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 71442
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
1 5346
 
7.5%
M 5184
 
7.3%
2 5184
 
7.3%
7 4860
 
6.8%
, 4212
 
5.9%
C 4212
 
5.9%
O 3240
 
4.5%
9 3240
 
4.5%
S 3240
 
4.5%
0 3078
 
4.3%
Other values (26) 29646
41.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 71442
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
1 5346
 
7.5%
M 5184
 
7.3%
2 5184
 
7.3%
7 4860
 
6.8%
, 4212
 
5.9%
C 4212
 
5.9%
O 3240
 
4.5%
9 3240
 
4.5%
S 3240
 
4.5%
0 3078
 
4.3%
Other values (26) 29646
41.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 71442
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
1 5346
 
7.5%
M 5184
 
7.3%
2 5184
 
7.3%
7 4860
 
6.8%
, 4212
 
5.9%
C 4212
 
5.9%
O 3240
 
4.5%
9 3240
 
4.5%
S 3240
 
4.5%
0 3078
 
4.3%
Other values (26) 29646
41.5%

DiagnosisSubtype_pathology
Categorical

High correlation  Imbalance  Missing 

Distinct2
Distinct (%)0.1%
Missing543
Missing (%)19.9%
Memory size21.5 KiB
epithelioid type
2106 
spindle type
 
84

Length

Max length16
Median length16
Mean length15.846575
Min length12

Characters and Unicode

Total characters34704
Distinct characters12
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowepithelioid type
2nd rowepithelioid type
3rd rowepithelioid type
4th rowepithelioid type
5th rowepithelioid type

Common Values

Value | Count | Frequency (%)
epithelioid type | 2106 | 77.1%
spindle type | 84 | 3.1%
(Missing) | 543 | 19.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
type 2190
50.0%
epithelioid 2106
48.1%
spindle 84
 
1.9%

Most occurring characters

ValueCountFrequency (%)
e 6486
18.7%
i 6402
18.4%
p 4380
12.6%
t 4296
12.4%
l 2190
 
6.3%
d 2190
 
6.3%
y 2190
 
6.3%
2190
 
6.3%
o 2106
 
6.1%
h 2106
 
6.1%
Other values (2) 168
 
0.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 34704
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e 6486
18.7%
i 6402
18.4%
p 4380
12.6%
t 4296
12.4%
l 2190
 
6.3%
d 2190
 
6.3%
y 2190
 
6.3%
2190
 
6.3%
o 2106
 
6.1%
h 2106
 
6.1%
Other values (2) 168
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 34704
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e 6486
18.7%
i 6402
18.4%
p 4380
12.6%
t 4296
12.4%
l 2190
 
6.3%
d 2190
 
6.3%
y 2190
 
6.3%
2190
 
6.3%
o 2106
 
6.1%
h 2106
 
6.1%
Other values (2) 168
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 34704
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e 6486
18.7%
i 6402
18.4%
p 4380
12.6%
t 4296
12.4%
l 2190
 
6.3%
d 2190
 
6.3%
y 2190
 
6.3%
2190
 
6.3%
o 2106
 
6.1%
h 2106
 
6.1%
Other values (2) 168
 
0.5%

Sample ID_pathology
Categorical

High correlation 

Distinct44
Distinct (%)1.6%
Missing9
Missing (%)0.3%
Memory size21.5 KiB
AH5
234 
AH5Q86T81
234 
AH5Q89
234 
AH6T26
234 
AH6T28
234 
Other values (39)
1554 

Length

Max length19
Median length16
Mean length6.2422907
Min length3

Characters and Unicode

Total characters17004
Distinct characters41
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAH5
2nd rowAH5Q86T81
3rd rowAH5Q89
4th rowAH6T26
5th rowAH6T28

Common Values

Value | Count | Frequency (%)
AH5 | 234 | 8.6%
AH5Q86T81 | 234 | 8.6%
AH5Q89 | 234 | 8.6%
AH6T26 | 234 | 8.6%
AH6T28 | 234 | 8.6%
AH8 | 234 | 8.6%
AH8Q94 | 234 | 8.6%
AH9 | 234 | 8.6%
Originator | 234 | 8.6%
ORIGINATOR | 54 | 2.0%
Other values (34) | 564 | 20.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
originator 288
10.6%
ah5q86t81 234
8.6%
ah5 234
8.6%
ah5q89 234
8.6%
ah6t26 234
8.6%
ah8 234
8.6%
ah6t28 234
8.6%
ah8q94 234
8.6%
ah9 234
8.6%
vqf 20
 
0.7%
Other values (33) 544
20.0%

Most occurring characters

ValueCountFrequency (%)
H 2072
 
12.2%
A 1976
 
11.6%
8 1572
 
9.2%
6 1114
 
6.6%
Q 1030
 
6.1%
T 836
 
4.9%
5 786
 
4.6%
9 754
 
4.4%
O 522
 
3.1%
2 504
 
3.0%
Other values (31) 5838
34.3%

Most occurring categories

ValueCountFrequency (%)
(unknown) 17004
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
H 2072
 
12.2%
A 1976
 
11.6%
8 1572
 
9.2%
6 1114
 
6.6%
Q 1030
 
6.1%
T 836
 
4.9%
5 786
 
4.6%
9 754
 
4.4%
O 522
 
3.1%
2 504
 
3.0%
Other values (31) 5838
34.3%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 17004
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
H 2072
 
12.2%
A 1976
 
11.6%
8 1572
 
9.2%
6 1114
 
6.6%
Q 1030
 
6.1%
T 836
 
4.9%
5 786
 
4.6%
9 754
 
4.4%
O 522
 
3.1%
2 504
 
3.0%
Other values (31) 5838
34.3%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 17004
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
H 2072
 
12.2%
A 1976
 
11.6%
8 1572
 
9.2%
6 1114
 
6.6%
Q 1030
 
6.1%
T 836
 
4.9%
5 786
 
4.6%
9 754
 
4.4%
O 522
 
3.1%
2 504
 
3.0%
Other values (31) 5838
34.3%

Patient/OriginatingSpecimen_pathology
Boolean

High correlation  Imbalance 

Distinct2
Distinct (%)0.1%
Missing9
Missing (%)0.3%
Memory size5.5 KiB
False
2436 
True
288 
(Missing)
 
9
Value | Count | Frequency (%)
False | 2436 | 89.1%
True | 288 | 10.5%
(Missing) | 9 | 0.3%

Passage_pathology
Categorical

High correlation  Missing 

Distinct5
Distinct (%)0.2%
Missing297
Missing (%)10.9%
Memory size21.5 KiB
1.0
1144 
0.0
852 
2.0
352 
3.0
 
52
4.0
 
36

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters7308
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row2.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

Value | Count | Frequency (%)
1.0 | 1144 | 41.9%
0.0 | 852 | 31.2%
2.0 | 352 | 12.9%
3.0 | 52 | 1.9%
4.0 | 36 | 1.3%
(Missing) | 297 | 10.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 1144
47.0%
0.0 852
35.0%
2.0 352
 
14.4%
3.0 52
 
2.1%
4.0 36
 
1.5%

Most occurring characters

ValueCountFrequency (%)
0 3288
45.0%
. 2436
33.3%
1 1144
 
15.7%
2 352
 
4.8%
3 52
 
0.7%
4 36
 
0.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 7308
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 3288
45.0%
. 2436
33.3%
1 1144
 
15.7%
2 352
 
4.8%
3 52
 
0.7%
4 36
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 7308
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 3288
45.0%
. 2436
33.3%
1 1144
 
15.7%
2 352
 
4.8%
3 52
 
0.7%
4 36
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 7308
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 3288
45.0%
. 2436
33.3%
1 1144
 
15.7%
2 352
 
4.8%
3 52
 
0.7%
4 36
 
0.5%

PDM Type_pathology
Categorical

High correlation  Imbalance 

Distinct2
Distinct (%)0.1%
Missing9
Missing (%)0.3%
Memory size21.5 KiB
PDX
2436 
Patient/Originator Specimen
288 

Length

Max length27
Median length3
Mean length5.5374449
Min length3

Characters and Unicode

Total characters15084
Distinct characters18
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPDX
2nd rowPDX
3rd rowPDX
4th rowPDX
5th rowPDX

Common Values

Value | Count | Frequency (%)
PDX | 2436 | 89.1%
Patient/Originator Specimen | 288 | 10.5%
(Missing) | 9 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
pdx 2436
80.9%
patient/originator 288
 
9.6%
specimen 288
 
9.6%

Most occurring characters

ValueCountFrequency (%)
P 2724
18.1%
D 2436
16.1%
X 2436
16.1%
i 1152
7.6%
t 864
 
5.7%
e 864
 
5.7%
n 864
 
5.7%
a 576
 
3.8%
r 576
 
3.8%
/ 288
 
1.9%
Other values (8) 2304
15.3%

Most occurring categories

ValueCountFrequency (%)
(unknown) 15084
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
P 2724
18.1%
D 2436
16.1%
X 2436
16.1%
i 1152
7.6%
t 864
 
5.7%
e 864
 
5.7%
n 864
 
5.7%
a 576
 
3.8%
r 576
 
3.8%
/ 288
 
1.9%
Other values (8) 2304
15.3%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 15084
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
P 2724
18.1%
D 2436
16.1%
X 2436
16.1%
i 1152
7.6%
t 864
 
5.7%
e 864
 
5.7%
n 864
 
5.7%
a 576
 
3.8%
r 576
 
3.8%
/ 288
 
1.9%
Other values (8) 2304
15.3%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 15084
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
P 2724
18.1%
D 2436
16.1%
X 2436
16.1%
i 1152
7.6%
t 864
 
5.7%
e 864
 
5.7%
n 864
 
5.7%
a 576
 
3.8%
r 576
 
3.8%
/ 288
 
1.9%
Other values (8) 2304
15.3%

Tumor Grade_pathology
Categorical

High correlation  Imbalance 

Distinct4
Distinct (%)0.1%
Missing9
Missing (%)0.3%
Memory size21.5 KiB
<Unknown>
2574 
Low grade or well differentiated
 
90
High grade or poorly differentiated
 
32
Intermediate grade or moderately differentiated
 
28

Length

Max length47
Median length9
Mean length10.455947
Min length9

Characters and Unicode

Total characters28482
Distinct characters24
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row<Unknown>
2nd row<Unknown>
3rd row<Unknown>
4th row<Unknown>
5th row<Unknown>

Common Values

Value | Count | Frequency (%)
<Unknown> | 2574 | 94.2%
Low grade or well differentiated | 90 | 3.3%
High grade or poorly differentiated | 32 | 1.2%
Intermediate grade or moderately differentiated | 28 | 1.0%
(Missing) | 9 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
unknown 2574
77.4%
grade 150
 
4.5%
differentiated 150
 
4.5%
or 150
 
4.5%
low 90
 
2.7%
well 90
 
2.7%
high 32
 
1.0%
poorly 32
 
1.0%
intermediate 28
 
0.8%
moderately 28
 
0.8%

Most occurring characters

ValueCountFrequency (%)
n 7900
27.7%
o 2906
 
10.2%
w 2754
 
9.7%
< 2574
 
9.0%
k 2574
 
9.0%
U 2574
 
9.0%
> 2574
 
9.0%
e 830
 
2.9%
600
 
2.1%
r 538
 
1.9%
Other values (14) 2658
 
9.3%

Most occurring categories

ValueCountFrequency (%)
(unknown) 28482
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
n 7900
27.7%
o 2906
 
10.2%
w 2754
 
9.7%
< 2574
 
9.0%
k 2574
 
9.0%
U 2574
 
9.0%
> 2574
 
9.0%
e 830
 
2.9%
600
 
2.1%
r 538
 
1.9%
Other values (14) 2658
 
9.3%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 28482
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
n 7900
27.7%
o 2906
 
10.2%
w 2754
 
9.7%
< 2574
 
9.0%
k 2574
 
9.0%
U 2574
 
9.0%
> 2574
 
9.0%
e 830
 
2.9%
600
 
2.1%
r 538
 
1.9%
Other values (14) 2658
 
9.3%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 28482
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
n 7900
27.7%
o 2906
 
10.2%
w 2754
 
9.7%
< 2574
 
9.0%
k 2574
 
9.0%
U 2574
 
9.0%
> 2574
 
9.0%
e 830
 
2.9%
600
 
2.1%
r 538
 
1.9%
Other values (14) 2658
 
9.3%

Tumor Content_pathology
Real number (ℝ)

High correlation 

Distinct: 9
Distinct (%): 0.3%
Missing: 9
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 86.292217
Minimum: 40
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.5 KiB

Quantile statistics

Minimum: 40
5-th percentile: 40
Q1: 85
Median: 90
Q3: 95
95-th percentile: 95
Maximum: 100
Range: 60
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 15.178596
Coefficient of variation (CV): 0.17589763
Kurtosis: 4.5816666
Mean: 86.292217
Median Absolute Deviation (MAD): 5
Skewness: -2.3913432
Sum: 235060
Variance: 230.38979
Monotonicity: Not monotonic
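
The descriptive statistics above are tied together by simple arithmetic, which makes them easy to sanity-check. The sketch below uses only figures reported for Tumor Content_pathology and assumes the usual definitions (coefficient of variation = standard deviation / mean, variance = standard deviation squared, sum = mean × number of non-missing rows); the variable names are illustrative.

```python
import math

# Figures copied from the report for Tumor Content_pathology.
mean = 86.292217
std = 15.178596
variance = 230.38979
cv = 0.17589763
total = 235060
n_rows = 2733      # observations in the dataset
n_missing = 9      # missing values for this variable

n_valid = n_rows - n_missing  # non-missing observations

# Tolerances are needed because the report rounds its figures.
cv_ok = math.isclose(std / mean, cv, rel_tol=1e-6)
variance_ok = math.isclose(std ** 2, variance, rel_tol=1e-6)
sum_ok = math.isclose(mean * n_valid, total, rel_tol=1e-6)

print(cv_ok, variance_ok, sum_ok)
```

All three checks hold for the reported figures, which suggests the mean and sum were computed over the 2724 non-missing rows.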
Histogram with fixed size bins (bins=9)
Value | Count | Frequency (%)
95 | 984 | 36.0%
90 | 798 | 29.2%
85 | 538 | 19.7%
40 | 234 | 8.6%
100 | 90 | 3.3%
70 | 34 | 1.2%
60 | 16 | 0.6%
80 | 16 | 0.6%
75 | 14 | 0.5%
(Missing) | 9 | 0.3%
Value | Count | Frequency (%)
40 | 234 | 8.6%
60 | 16 | 0.6%
70 | 34 | 1.2%
75 | 14 | 0.5%
80 | 16 | 0.6%
85 | 538 | 19.7%
90 | 798 | 29.2%
95 | 984 | 36.0%
100 | 90 | 3.3%

Value | Count | Frequency (%)
100 | 90 | 3.3%
95 | 984 | 36.0%
90 | 798 | 29.2%
85 | 538 | 19.7%
80 | 16 | 0.6%
75 | 14 | 0.5%
70 | 34 | 1.2%
60 | 16 | 0.6%
40 | 234 | 8.6%

Necrosis_pathology
Categorical

High correlation  Imbalance 

Distinct5
Distinct (%)0.2%
Missing9
Missing (%)0.3%
Memory size21.5 KiB
0.0
2198 
10.0
234 
50.0
234 
5.0
 
44
15.0
 
14

Length

Max length4
Median length3
Mean length3.1769457
Min length3

Characters and Unicode

Total characters8654
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row10.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value | Count | Frequency (%)
0.0 | 2198 | 80.4%
10.0 | 234 | 8.6%
50.0 | 234 | 8.6%
5.0 | 44 | 1.6%
15.0 | 14 | 0.5%
(Missing) | 9 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 2198
80.7%
10.0 234
 
8.6%
50.0 234
 
8.6%
5.0 44
 
1.6%
15.0 14
 
0.5%

Most occurring characters

ValueCountFrequency (%)
0 5390
62.3%
. 2724
31.5%
5 292
 
3.4%
1 248
 
2.9%

Most occurring categories

ValueCountFrequency (%)
(unknown) 8654
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 5390
62.3%
. 2724
31.5%
5 292
 
3.4%
1 248
 
2.9%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 8654
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 5390
62.3%
. 2724
31.5%
5 292
 
3.4%
1 248
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 8654
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 5390
62.3%
. 2724
31.5%
5 292
 
3.4%
1 248
 
2.9%

Stromal_pathology
Real number (ℝ)

High correlation  Zeros 

Distinct: 7
Distinct (%): 0.3%
Missing: 9
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.9258443
Minimum: 0
Maximum: 40
Zeros: 218
Zeros (%): 8.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 5
Q3: 10
95-th percentile: 15
Maximum: 40
Range: 40
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.3325345
Coefficient of variation (CV): 0.67280333
Kurtosis: 9.2886266
Mean: 7.9258443
Median Absolute Deviation (MAD): 5
Skewness: 2.1530975
Sum: 21590
Variance: 28.435924
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value | Count | Frequency (%)
5 | 1234 | 45.2%
10 | 930 | 34.0%
15 | 276 | 10.1%
0 | 218 | 8.0%
30 | 34 | 1.2%
20 | 16 | 0.6%
40 | 16 | 0.6%
(Missing) | 9 | 0.3%
Value | Count | Frequency (%)
0 | 218 | 8.0%
5 | 1234 | 45.2%
10 | 930 | 34.0%
15 | 276 | 10.1%
20 | 16 | 0.6%
30 | 34 | 1.2%
40 | 16 | 0.6%

Value | Count | Frequency (%)
40 | 16 | 0.6%
30 | 34 | 1.2%
20 | 16 | 0.6%
15 | 276 | 10.1%
10 | 930 | 34.0%
5 | 1234 | 45.2%
0 | 218 | 8.0%

Inflammatory Cell_pathology
Categorical

High correlation 

Distinct3
Distinct (%)0.1%
Missing9
Missing (%)0.3%
Memory size21.5 KiB
Absent/No Inflammation
2214 
<Unknown>
342 
1+ (Low)
 
168

Length

Max length22
Median length22
Mean length19.504405
Min length8

Characters and Unicode

Total characters53130
Distinct characters26
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAbsent/No Inflammation
2nd rowAbsent/No Inflammation
3rd rowAbsent/No Inflammation
4th rowAbsent/No Inflammation
5th rowAbsent/No Inflammation

Common Values

Value | Count | Frequency (%)
Absent/No Inflammation | 2214 | 81.0%
<Unknown> | 342 | 12.5%
1+ (Low) | 168 | 6.1%
(Missing) | 9 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
absent/no 2214
43.4%
inflammation 2214
43.4%
unknown 342
 
6.7%
1 168
 
3.3%
low 168
 
3.3%

Most occurring characters

ValueCountFrequency (%)
n 7668
14.4%
o 4938
 
9.3%
t 4428
 
8.3%
a 4428
 
8.3%
m 4428
 
8.3%
2382
 
4.5%
s 2214
 
4.2%
e 2214
 
4.2%
b 2214
 
4.2%
A 2214
 
4.2%
Other values (16) 16002
30.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 53130
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
n 7668
14.4%
o 4938
 
9.3%
t 4428
 
8.3%
a 4428
 
8.3%
m 4428
 
8.3%
2382
 
4.5%
s 2214
 
4.2%
e 2214
 
4.2%
b 2214
 
4.2%
A 2214
 
4.2%
Other values (16) 16002
30.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 53130
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
n 7668
14.4%
o 4938
 
9.3%
t 4428
 
8.3%
a 4428
 
8.3%
m 4428
 
8.3%
2382
 
4.5%
s 2214
 
4.2%
e 2214
 
4.2%
b 2214
 
4.2%
A 2214
 
4.2%
Other values (16) 16002
30.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 53130
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
n 7668
14.4%
o 4938
 
9.3%
t 4428
 
8.3%
a 4428
 
8.3%
m 4428
 
8.3%
2382
 
4.5%
s 2214
 
4.2%
e 2214
 
4.2%
b 2214
 
4.2%
A 2214
 
4.2%
Other values (16) 16002
30.1%

Pathology Notes_pathology
Categorical

High correlation 

Distinct21
Distinct (%)0.8%
Missing9
Missing (%)0.3%
Memory size21.5 KiB
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done.
702 
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34 in another passage. Conventional grading of GIST is not done.
468 
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. A previous passage of the tumor was positive for CD34. Conventional grading of GIST is not done.
234 
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done.
234 
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The previous passage of the tumor was positive for CD34. Conventional grading of GIST is not done.
234 
Other values (16)
852 

Length

Max length658
Median length579
Mean length375.39648
Min length335

Characters and Unicode

Total characters1022580
Distinct characters51
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowGastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done.
2nd rowGastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. A previous passage of the tumor was positive for CD34. Conventional grading of GIST is not done.
3rd rowGastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The previous passage of the tumor was positive for CD34. Conventional grading of GIST is not done.
4th rowGastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done.
5th rowGastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done.

Common Values

Value | Count | Frequency (%)
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done. | 702 | 25.7%
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34 in another passage. Conventional grading of GIST is not done. | 468 | 17.1%
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. A previous passage of the tumor was positive for CD34. Conventional grading of GIST is not done. | 234 | 8.6%
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done. | 234 | 8.6%
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The previous passage of the tumor was positive for CD34. Conventional grading of GIST is not done. | 234 | 8.6%
Gastrointestinal stromal tumor (GIST), epithelioid. The tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. Another passage of the tumor was positive for CD34. Conventional grading of GIST is not done. | 234 | 8.6%
Gastrointestinal Stromal Tumors (GIST) The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. The stroma exhibit areas of myxoid change. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. | 180 | 6.6%
Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma may exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors. | 94 | 3.4%
Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors. | 70 | 2.6%
Gastrointestinal stromal tumor (GIST). The tumor is composed of mixture of bland spindle and epithelioid cells that have faintly eosinophilic cytoplasm in a syncytial pattern; the cells have pleomorphic nuclei with inconspicuous nucleoli; occasional palisading pattern are present. Mitosis is difficult to find and is probably rare. The originator tumor was diffusely positive for CD34. | 54 | 2.0%
Other values (11) | 220 | 8.0%

Length

Histogram of lengths of the category
Value                 Count    Frequency (%)
of                    8480     5.7%
tumor                 6862     4.6%
is                    6726     4.5%
the                   6452     4.3%
cells                 6204     4.2%
with                  4844     3.2%
gist                  4830     3.2%
epithelioid           4320     2.9%
to                    4284     2.9%
for                   3384     2.3%
Other values (138)    92820    62.2%
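A word-frequency table like the one above can be approximated with a few lines of Python. This is only a sketch under assumptions: the function name `word_frequencies` and the tokenization rules (lower-casing, splitting on whitespace, stripping surrounding punctuation) are illustrative choices, not necessarily what the profiling tool does internally, and the two input strings are shortened stand-ins for the real column values.

```python
from collections import Counter
import string


def word_frequencies(values, top=10):
    """Return (word, count, percent) rows for the most common words.

    Assumed tokenization: lower-case, split on whitespace, strip
    leading/trailing punctuation. The profiler may tokenize differently.
    """
    counts = Counter()
    for text in values:
        for token in text.lower().split():
            word = token.strip(string.punctuation)
            if word:
                counts[word] += 1
    total = sum(counts.values())
    return [(w, c, 100.0 * c / total) for w, c in counts.most_common(top)]


# Hypothetical, shortened stand-ins for the pathology-note values.
rows = word_frequencies([
    "The tumor is positive for CD34.",
    "Gastrointestinal stromal tumor (GIST), epithelioid.",
])
for word, count, pct in rows:
    print(f"{word:<20}{count:>6}{pct:>8.1f}%")
```

The percentage is taken over all counted words, which matches how the "Other values" row in the table absorbs the remainder.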

Most occurring characters

Value                Count     Frequency (%)
(space)              152220    14.9%
o                    87044     8.5%
e                    82398     8.1%
i                    79698     7.8%
s                    64268     6.3%
t                    58750     5.7%
l                    52390     5.1%
n                    49018     4.8%
r                    46046     4.5%
a                    45070     4.4%
Other values (41)    305678    29.9%
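The character counts above can be reproduced in spirit with a plain counter over every character of every value. This is a minimal sketch, assuming characters are counted exactly as they appear (case-sensitive, whitespace included); `char_frequencies` is a hypothetical helper name and the sample strings are placeholders for the real column.

```python
from collections import Counter


def char_frequencies(values, top=10):
    """Return (character, count, percent) rows for the most common characters.

    Assumption: every character is counted as-is, including spaces.
    """
    counts = Counter()
    for text in values:
        counts.update(text)  # a string is iterated character by character
    total = sum(counts.values())
    return [(ch, n, 100.0 * n / total) for ch, n in counts.most_common(top)]


# Placeholder input; the real column holds long pathology notes.
result = char_frequencies(["ab a", "a"])
for ch, n, pct in result:
    label = "(space)" if ch == " " else ch
    print(f"{label:<10}{n:>6}{pct:>8.1f}%")
```

On ordinary English prose a count like this usually puts the space character first, as in the table above.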

Most occurring categories

Value        Count      Frequency (%)
(unknown)    1022580    100.0%

Most frequent character per category

(unknown)
Value                Count     Frequency (%)
(space)              152220    14.9%
o                    87044     8.5%
e                    82398     8.1%
i                    79698     7.8%
s                    64268     6.3%
t                    58750     5.7%
l                    52390     5.1%
n                    49018     4.8%
r                    46046     4.5%
a                    45070     4.4%
Other values (41)    305678    29.9%

Most occurring scripts

Value        Count      Frequency (%)
(unknown)    1022580    100.0%

Most frequent character per script

(unknown)
Value                Count     Frequency (%)
(space)              152220    14.9%
o                    87044     8.5%
e                    82398     8.1%
i                    79698     7.8%
s                    64268     6.3%
t                    58750     5.7%
l                    52390     5.1%
n                    49018     4.8%
r                    46046     4.5%
a                    45070     4.4%
Other values (41)    305678    29.9%

Most occurring blocks

Value        Count      Frequency (%)
(unknown)    1022580    100.0%

Most frequent character per block

(unknown)
Value                Count     Frequency (%)
(space)              152220    14.9%
o                    87044     8.5%
e                    82398     8.1%
i                    79698     7.8%
s                    64268     6.3%
t                    58750     5.7%
l                    52390     5.1%
n                    49018     4.8%
r                    46046     4.5%
a                    45070     4.4%
Other values (41)    305678    29.9%

Interactions


Correlations

%European(CEU)_df_pinfo_paciente_specimens | %European(CEU)_df_race_paciente_specimens | %Native and LatinAmerican (NA)_df_pinfo_paciente_specimens | %Native and LatinAmerican (NA)_df_race_paciente_specimens | %West African(YRI)_df_pinfo_paciente_specimens | %West African(YRI)_df_race_paciente_specimens | Able to ViablyPassage in nude mice_specimen_specimens | AdditionalMedicalHistory_df_pinfo_paciente_specimens | Age atDiagnosis_df_pinfo_paciente_specimens | Age atSampling_specimen_specimens | Best Response_trathistory_paciente_specimens | BiologicalSex_df_pinfo_paciente_specimens | BiopsySite_specimen_specimens | Chr End_gene | Chr Start_gene | Chr_gene | Consensus Whole ExomeSequence Avail_specimen_specimens | DiagnosisSubtype_df_pinfo_paciente_specimens | DiagnosisSubtype_df_race_paciente_specimens | DiagnosisSubtype_pathology | DiagnosisSubtype_samples | DiagnosisSubtype_specimen_specimens | DiagnosisSubtype_trathistory_paciente_specimens | Ethnicity_df_pinfo_paciente_specimens | Existing Variant_gene | Grade StageInformation_df_pinfo_paciente_specimens | HGVS ProteinChange_gene | HGVS cDNAChange_gene | Has KnownMetastaticDisease_df_pinfo_paciente_specimens | Has Smoked100 Cigarettes_df_pinfo_paciente_specimens | HugoSymbol_gene | Human PathogenTesting Summary_specimen_specimens | InferredAncestry_df_pinfo_paciente_specimens | InferredAncestry_df_race_paciente_specimens | Inflammatory Cell_pathology | MSI Status_specimen_specimens | ModelNotes_specimen_specimens | Molecular andIHC Data_df_pinfo_paciente_specimens | MutationEffect_gene | Necrosis_pathology | Occupation_df_pinfo_paciente_specimens | OncoKB Cancer GenePanel Data Avail_samples | Oncogenicity_gene | PDM Type_pathology | PDM Type_samples | PDX GrowthCurve Avail_specimen_specimens | Passage_pathology | Passage_samples | Pathology Notes_pathology | PathologyAvail_samples | Patient ID | Patient/OriginatingSpecimen_pathology | Patient/OriginatingSpecimen_samples | PatientNotes_df_pinfo_paciente_specimens | ProvidedTissue Origin_specimen_specimens | RNASeqAvail_samples | Race_df_pinfo_paciente_specimens | Sample ID_gene | Sample ID_pathology | Sample ID_samples | Self-ReportedEthnicity_df_race_paciente_specimens | Self-ReportedRace_df_race_paciente_specimens | Specimen ID | StandardizedRegimen_trathistory_paciente_specimens | Stromal_pathology | Timing_trathistory_paciente_specimens | Total Reads_gene | Tumor Content_pathology | Tumor Grade_pathology | Variant AlleleFrequency_gene | VariantClass_gene | Whole ExomeSequence Avail_samples
%European(CEU)_df_pinfo_paciente_specimens1.0001.0001.0001.0000.9900.9900.7040.9990.7330.6100.8050.9860.9991.0001.0001.0000.1441.0001.0001.0001.0001.0001.0001.0001.0000.5091.0001.0000.5800.8861.0000.5800.8200.8200.4170.1440.9930.9991.0000.1620.9990.1071.0000.0340.0280.1440.4680.4680.9520.0000.9990.0340.0280.9990.5720.1090.9951.0000.9400.9351.0000.9950.9990.4660.5490.0551.0000.3780.6791.0001.0000.107
%European(CEU)_df_race_paciente_specimens1.0001.0001.0001.0000.9900.9900.7040.9990.7330.6100.8050.9860.9991.0001.0001.0000.1441.0001.0001.0001.0001.0001.0001.0001.0000.5091.0001.0000.5800.8861.0000.5800.8200.8200.4170.1440.9930.9991.0000.1620.9990.1071.0000.0340.0280.1440.4680.4680.9520.0000.9990.0340.0280.9990.5720.1090.9951.0000.9400.9351.0000.9950.9990.4660.5490.0551.0000.3780.6791.0001.0000.107
%Native and LatinAmerican (NA)_df_pinfo_paciente_specimens1.0001.0001.0000.8330.0000.0000.1621.0000.7730.7730.9920.0850.9991.0001.0001.0000.0001.0001.0001.0001.0001.0001.0000.8331.0000.0681.0001.0000.0721.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0000.9991.0001.0001.0000.0441.0001.0001.0001.0001.0000.8331.0000.9990.5781.0000.0001.0001.0001.0001.0001.0001.000
%Native and LatinAmerican (NA)_df_race_paciente_specimens1.0001.0000.8331.0000.0000.0000.1621.0000.7730.7730.9920.0850.9991.0001.0001.0000.0001.0001.0001.0001.0001.0001.0000.8331.0000.0681.0001.0000.0721.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0000.9991.0001.0001.0000.0441.0001.0001.0001.0001.0000.8331.0000.9990.5781.0000.0001.0001.0001.0001.0001.0001.000
%West African(YRI)_df_pinfo_paciente_specimens0.9900.9900.0000.0001.0001.0000.7080.9990.7030.5050.6960.9880.9991.0001.0001.0000.0001.0001.0001.0001.0001.0001.0000.0001.0000.6221.0001.0000.5620.5351.0000.0670.7070.7070.4170.0001.0000.9991.0000.1620.9990.1071.0000.0340.0280.0000.4680.4680.9520.0000.9990.0340.0280.9990.5630.1090.7071.0000.9400.9350.0000.7070.9990.3940.5490.0591.0000.3780.6791.0001.0000.107
%West African(YRI)_df_race_paciente_specimens0.9900.9900.0000.0001.0001.0000.7080.9990.7030.5050.6960.9880.9991.0001.0001.0000.0001.0001.0001.0001.0001.0001.0000.0001.0000.6221.0001.0000.5620.5351.0000.0670.7070.7070.4170.0001.0000.9991.0000.1620.9990.1071.0000.0340.0280.0000.4680.4680.9520.0000.9990.0340.0280.9990.5630.1090.7071.0000.9400.9350.0000.7070.9990.3940.5490.0591.0000.3780.6791.0001.0000.107
Able to ViablyPassage in nude mice_specimen_specimens0.7040.7040.1620.1620.7080.7081.0000.9990.7020.7220.6960.6520.9941.0001.0001.0000.2311.0001.0001.0001.0001.0001.0000.1621.0000.7051.0001.0000.5590.1231.0000.1310.4990.4990.2910.2311.0001.0001.0000.4010.9990.0741.0000.0710.0710.2310.3870.3870.9430.0000.9990.0710.0710.9990.5350.0740.4811.0000.9630.9600.1620.4810.9990.3670.4490.0591.0000.3630.7241.0001.0000.074
AdditionalMedicalHistory_df_pinfo_paciente_specimens0.9990.9991.0001.0000.9990.9990.9991.0001.0001.0000.9960.9991.0001.0001.0001.0000.9990.9940.9940.9940.9940.9940.9941.0001.0001.0001.0001.0000.9990.9991.0000.9990.9990.9990.7790.9991.0001.0001.0000.3541.0000.1661.0000.0910.0660.9990.3320.3320.9180.3711.0000.0910.0661.0000.9990.1260.9991.0000.9950.9721.0000.9991.0000.7060.5580.0001.0000.4620.3761.0001.0000.166
Age atDiagnosis_df_pinfo_paciente_specimens0.7330.7330.7730.7730.7030.7030.7021.0001.0000.3680.8050.7250.9971.0001.0001.0000.6850.9940.9940.9940.9940.9940.9940.7731.0000.7841.0001.0000.6430.8061.0000.7290.7770.7770.7720.6850.7061.0001.0000.3620.9990.1201.0000.0730.0600.6850.2430.2450.9200.192-0.6480.0730.0601.0000.8860.1060.7451.0000.9690.9660.7730.7450.9990.6610.2010.000NaN-0.1030.222NaN1.0000.120
Age atSampling_specimen_specimens0.6100.6100.7730.7730.5050.5050.7221.0000.3681.0000.6620.4580.8951.0001.0001.0000.8930.9940.9940.9940.9940.9940.9940.7731.0000.9551.0001.0000.7770.5881.0000.7290.7070.7070.5740.8931.0001.0001.0000.2970.9990.0681.0000.0350.0000.8930.2740.2760.9500.1930.0600.0350.0001.0000.8650.0380.6321.0000.9650.9600.7730.6320.9990.603-0.0540.019NaN0.0950.559NaN1.0000.068
Best Response_trathistory_paciente_specimens0.8050.8050.9920.9920.6960.6960.6960.9960.8050.6621.0000.4620.7020.0000.0000.0000.0210.0000.0000.0000.0000.0000.0000.9920.0000.4060.0000.0000.9920.8040.0000.9920.7690.7690.7030.0210.0000.9960.0000.2210.7040.0170.0000.1090.1090.0210.3430.3430.6631.0000.7020.1090.1090.7061.0000.0170.7690.0000.5880.5880.9920.7690.7020.9950.4800.9920.0000.4200.6620.0000.0000.017
BiologicalSex_df_pinfo_paciente_specimens0.9860.9860.0850.0850.9880.9880.6520.9990.7250.4580.4621.0000.9941.0001.0001.0000.0191.0001.0001.0001.0001.0001.0000.0851.0000.5971.0001.0000.3400.5421.0000.1460.9950.9950.3890.0190.9930.9991.0000.1810.9990.1061.0000.0310.0250.0190.5890.5900.9460.0000.9990.0310.0250.9990.5680.1090.9951.0000.9620.9530.0850.9950.9990.4300.6210.0381.0000.3980.6311.0001.0000.106
BiopsySite_specimen_specimens0.9990.9990.9990.9990.9990.9990.9941.0000.9970.8950.7020.9941.0001.0001.0001.0000.8190.9940.9940.9940.9940.9940.9940.9991.0000.8681.0001.0000.9950.9991.0000.9950.9990.9990.8510.8191.0001.0001.0000.3171.0000.1671.0000.0900.0670.8190.3540.3540.9370.3711.0000.0900.0671.0000.9990.1330.9991.0000.9660.9520.9990.9991.0000.6860.4540.0371.0000.3940.6131.0001.0000.167
Chr End_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.9961.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0000.0001.0000.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0000.3700.0000.0001.0001.0001.0000.0000.0000.0000.7260.0001.0000.9991.0000.000
Chr Start_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.9961.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0000.0001.0000.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0000.3700.0000.0001.0001.0001.0000.0000.0000.0000.7260.0001.0000.9991.0000.000
Chr_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.9961.0001.0001.0001.0001.0000.9971.0001.0001.0001.0001.0001.0001.0001.0000.0001.0000.0001.0000.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0000.5250.0000.0001.0001.0001.0000.0000.0000.0001.0000.0001.0000.9990.2650.000
Consensus Whole ExomeSequence Avail_specimen_specimens0.1440.1440.0000.0000.0000.0000.2310.9990.6850.8930.0210.0190.8191.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0000.8161.0001.0000.1160.0991.0000.0000.8160.8161.0000.9161.0000.9991.0001.0000.9991.0001.0001.0001.0000.9161.0001.0001.0001.0000.9991.0001.0000.9990.0741.0000.0001.0001.0001.0000.0000.0000.9990.4141.0000.0001.0001.0001.0001.0001.0001.000
DiagnosisSubtype_df_pinfo_paciente_specimens1.0001.0001.0001.0001.0001.0001.0000.9940.9940.9940.0001.0000.9941.0001.0001.0001.0001.0000.9940.9940.9940.9940.9941.0001.0000.9941.0001.0001.0000.9941.0000.9941.0001.0000.9941.0001.0000.9941.0000.1000.9940.1331.0000.0620.0001.0000.0000.0000.9990.3550.9940.0620.0000.9940.9940.0721.0001.0000.9970.9971.0001.0000.9941.0000.8160.0001.0000.2770.3861.0001.0000.133
DiagnosisSubtype_df_race_paciente_specimens1.0001.0001.0001.0001.0001.0001.0000.9940.9940.9940.0001.0000.9941.0001.0001.0001.0000.9941.0000.9940.9940.9940.9941.0001.0000.9941.0001.0001.0000.9941.0000.9941.0001.0000.9941.0001.0000.9941.0000.1000.9940.1331.0000.0620.0001.0000.0000.0000.9990.3550.9940.0620.0000.9940.9940.0721.0001.0000.9970.9971.0001.0000.9941.0000.8160.0001.0000.2770.3861.0001.0000.133
DiagnosisSubtype_pathology1.0001.0001.0001.0001.0001.0001.0000.9940.9940.9940.0001.0000.9941.0001.0001.0001.0000.9940.9941.0000.9940.9940.9941.0001.0000.9941.0001.0001.0000.9941.0000.9941.0001.0000.9941.0001.0000.9941.0000.1000.9940.1331.0000.0620.0001.0000.0000.0000.9990.3550.9940.0620.0000.9940.9940.0721.0001.0000.9970.9971.0001.0000.9941.0000.8160.0001.0000.2770.3861.0001.0000.133
DiagnosisSubtype_samples1.0001.0001.0001.0001.0001.0001.0000.9940.9940.9940.0001.0000.9941.0001.0001.0001.0000.9940.9940.9941.0000.9940.9941.0001.0000.9941.0001.0001.0000.9941.0000.9941.0001.0000.9941.0001.0000.9941.0000.1000.9940.1331.0000.0620.0001.0000.0000.0000.9990.3550.9940.0620.0000.9940.9940.0721.0001.0000.9970.9971.0001.0000.9941.0000.8160.0001.0000.2770.3861.0001.0000.133
DiagnosisSubtype_specimen_specimens1.0001.0001.0001.0001.0001.0001.0000.9940.9940.9940.0001.0000.9941.0001.0001.0001.0000.9940.9940.9940.9941.0000.9941.0001.0000.9941.0001.0001.0000.9941.0000.9941.0001.0000.9941.0001.0000.9941.0000.1000.9940.1331.0000.0620.0001.0000.0000.0000.9990.3550.9940.0620.0000.9940.9940.0721.0001.0000.9970.9971.0001.0000.9941.0000.8160.0001.0000.2770.3861.0001.0000.133
DiagnosisSubtype_trathistory_paciente_specimens1.0001.0001.0001.0001.0001.0001.0000.9940.9940.9940.0001.0000.9941.0001.0001.0001.0000.9940.9940.9940.9940.9941.0001.0001.0000.9941.0001.0001.0000.9941.0000.9941.0001.0000.9941.0001.0000.9941.0000.1000.9940.1331.0000.0620.0001.0000.0000.0000.9990.3550.9940.0620.0000.9940.9940.0721.0001.0000.9970.9971.0001.0000.9941.0000.8160.0001.0000.2770.3861.0001.0000.133
Ethnicity_df_pinfo_paciente_specimens1.0001.0000.8330.8330.0000.0000.1621.0000.7730.7730.9920.0850.9991.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0000.0681.0001.0000.0721.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0000.9991.0001.0001.0000.0441.0001.0001.0001.0001.0000.8331.0000.9990.5781.0000.0001.0001.0001.0001.0001.0001.000
Existing Variant_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0000.9960.9960.9961.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.9960.9961.0001.0000.9961.0001.0001.0001.0001.0001.0001.0000.9960.0001.0000.0000.9960.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0000.6420.0000.0001.0001.0001.0000.0000.0000.0000.9990.0001.0001.0001.0000.000
Grade StageInformation_df_pinfo_paciente_specimens0.5090.5090.0680.0680.6220.6220.7051.0000.7840.9550.4060.5970.8681.0001.0001.0000.8160.9940.9940.9940.9940.9940.9940.0681.0001.0001.0001.0000.6660.4901.0000.4650.5340.5340.6620.8161.0001.0001.0000.3160.9990.1121.0000.0180.0000.8160.3350.3380.9600.1550.9990.0180.0001.0000.9990.0890.4211.0000.9700.9680.0680.4210.9990.5900.4900.0261.0000.4280.5561.0001.0000.112
HGVS ProteinChange_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.9961.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0000.0001.0000.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0000.3700.0000.0001.0001.0001.0000.0000.0000.0000.7260.0001.0000.9991.0000.000
HGVS cDNAChange_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.9961.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0000.0001.0000.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0000.3700.0000.0001.0001.0001.0000.0000.0000.0000.7260.0001.0000.9991.0000.000
Has KnownMetastaticDisease_df_pinfo_paciente_specimens0.5800.5800.0720.0720.5620.5620.5590.9990.6430.7770.9920.3400.9951.0001.0001.0000.1161.0001.0001.0001.0001.0001.0000.0721.0000.6661.0001.0001.0000.1241.0000.7820.3570.3570.5970.1161.0000.9991.0000.1700.9990.0361.0000.0000.0000.1160.3380.3380.9670.0000.9990.0000.0000.9990.6620.0400.3401.0000.9710.9620.0720.3400.9990.6460.5440.0311.0000.3470.5431.0001.0000.036
Has Smoked100 Cigarettes_df_pinfo_paciente_specimens0.8860.8861.0001.0000.5350.5350.1230.9990.8060.5880.8040.5420.9991.0001.0001.0000.0990.9940.9940.9940.9940.9940.9941.0001.0000.4901.0001.0000.1241.0001.0000.8280.8010.8010.7660.0990.8160.9991.0000.1830.9990.1431.0000.0180.0000.0990.2640.2750.9340.2180.9990.0180.0000.9990.5400.1100.7991.0000.9630.9501.0000.7990.9990.5390.5300.0001.0000.4230.1681.0001.0000.143
HugoSymbol_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0000.9971.0001.0001.0001.0001.0001.0001.0001.0000.9961.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0000.0001.0000.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0000.5250.0000.0001.0001.0001.0000.0000.0000.0001.0000.0001.0000.9990.2650.000
Human PathogenTesting Summary_specimen_specimens0.5800.5801.0001.0000.0670.0670.1310.9990.7290.7290.9920.1460.9951.0001.0001.0000.0000.9940.9940.9940.9940.9940.9941.0001.0000.4651.0001.0000.7820.8281.0001.0000.5800.5800.6470.0001.0000.9991.0000.1120.9990.1161.0000.0560.0000.0000.1670.1660.9370.3720.9990.0560.0000.9990.6270.0600.7101.0000.9750.9421.0000.7100.9990.6640.3840.0001.0000.3570.2001.0001.0000.116
InferredAncestry_df_pinfo_paciente_specimens0.8200.8201.0001.0000.7070.7070.4990.9990.7770.7070.7690.9950.9991.0001.0001.0000.8161.0001.0001.0001.0001.0001.0001.0001.0000.5341.0001.0000.3570.8011.0000.5801.0001.0000.3890.8160.9131.0001.0000.1810.9990.1061.0000.0310.0250.8160.5890.5900.9460.0000.9990.0310.0250.9990.5720.1091.0001.0000.9620.9531.0001.0000.9990.5040.6210.0291.0000.3980.6311.0001.0000.106
InferredAncestry_df_race_paciente_specimens0.8200.8201.0001.0000.7070.7070.4990.9990.7770.7070.7690.9950.9991.0001.0001.0000.8161.0001.0001.0001.0001.0001.0001.0001.0000.5341.0001.0000.3570.8011.0000.5801.0001.0000.3890.8160.9131.0001.0000.1810.9990.1061.0000.0310.0250.8160.5890.5900.9460.0000.9990.0310.0250.9990.5720.1091.0001.0000.9620.9531.0001.0000.9990.5040.6210.0291.0000.3980.6311.0001.0000.106
Inflammatory Cell_pathology0.4170.4171.0001.0000.4170.4170.2910.7790.7720.5740.7030.3890.8511.0001.0001.0001.0000.9940.9940.9940.9940.9940.9941.0001.0000.6621.0001.0000.5970.7661.0000.6470.3890.3891.0001.0001.0000.7601.0000.3010.8510.0951.0000.0840.0001.0000.2100.2020.8760.1730.8510.0840.0000.7600.8860.0680.3891.0000.9820.8171.0000.3890.8510.6570.4910.0001.0000.4330.1851.0001.0000.095
MSI Status_specimen_specimens0.1440.1440.0000.0000.0000.0000.2310.9990.6850.8930.0210.0190.8191.0001.0001.0000.9161.0001.0001.0001.0001.0001.0000.0001.0000.8161.0001.0000.1160.0991.0000.0000.8160.8161.0001.0001.0000.9991.0001.0000.9991.0001.0001.0001.0000.9161.0001.0001.0001.0000.9991.0001.0000.9990.0741.0000.0001.0001.0001.0000.0000.0000.9990.4141.0000.0001.0001.0001.0001.0001.0001.000
ModelNotes_specimen_specimens0.9930.9931.0001.0001.0001.0001.0001.0000.7061.0000.0000.9931.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.8161.0001.0000.9130.9131.0001.0001.0001.0001.0000.1131.0000.0711.0000.0270.0271.0000.6560.6560.9981.0001.0000.0270.0271.0001.0000.0711.0001.0000.9970.9971.0001.0001.0000.6620.9100.0651.0000.1751.0001.0001.0000.071
Molecular andIHC Data_df_pinfo_paciente_specimens0.9990.9991.0001.0000.9990.9991.0001.0001.0001.0000.9960.9991.0001.0001.0001.0000.9990.9940.9940.9940.9940.9940.9941.0001.0001.0001.0001.0000.9990.9991.0000.9991.0001.0000.7600.9991.0001.0001.0000.3831.0000.1521.0000.0990.0750.9990.3800.3800.9620.3711.0000.0990.0751.0000.9990.1070.9991.0000.9950.9691.0000.9991.0000.7170.5590.0561.0000.3740.6131.0001.0000.152
MutationEffect_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.9961.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0000.0001.0000.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0000.3700.0000.0001.0001.0001.0000.0000.0000.0000.7260.0001.0000.9991.0000.000
Necrosis_pathology0.1620.1621.0001.0000.1620.1620.4010.3540.3620.2970.2210.1810.3170.0000.0000.0001.0000.1000.1000.1000.1000.1000.1001.0000.0000.3160.0000.0000.1700.1830.0000.1120.1810.1810.3011.0000.1130.3830.0001.0000.3170.0000.0000.8910.0001.0000.5170.0360.8340.0000.3170.8910.0000.3830.3540.0000.1810.0000.9930.2931.0000.1810.3170.2240.2710.0000.0000.7750.2310.0000.0000.000
Occupation_df_pinfo_paciente_specimens0.9990.9991.0001.0000.9990.9990.9991.0000.9990.9990.7040.9991.0001.0001.0001.0000.9990.9940.9940.9940.9940.9940.9941.0001.0000.9991.0001.0000.9990.9991.0000.9990.9990.9990.8510.9991.0001.0001.0000.3171.0000.1671.0000.0900.0670.9990.3540.3540.9370.3711.0000.0900.0671.0000.9990.1330.9991.0000.9660.9521.0000.9991.0000.7130.4540.0361.0000.3940.6131.0001.0000.167
OncoKB Cancer GenePanel Data Avail_samples0.1070.1071.0001.0000.1070.1070.0740.1660.1200.0680.0170.1060.1670.0000.0000.0001.0000.1330.1330.1330.1330.1330.1331.0000.0000.1120.0000.0000.0360.1430.0000.1160.1060.1060.0951.0000.0710.1520.0000.0000.1671.0000.0000.0000.4041.0000.0510.2900.1350.0330.1670.0000.4040.1520.1000.9890.1060.0000.1150.9811.0000.1060.1670.0720.1060.0000.0000.0400.0510.0000.0000.999
Oncogenicity_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.9961.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0000.0001.0000.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0000.3700.0000.0001.0001.0001.0000.0000.0000.0000.7260.0001.0000.9991.0000.000
PDM Type_pathology0.0340.0341.0001.0000.0340.0340.0710.0910.0730.0350.1090.0310.0900.0000.0000.0001.0000.0620.0620.0620.0620.0620.0621.0000.0000.0180.0000.0000.0000.0180.0000.0560.0310.0310.0841.0000.0270.0990.0000.8910.0900.0000.0001.0000.0001.0001.0000.0000.5480.0000.0900.9980.0000.0990.0240.0000.0310.0000.9920.0001.0000.0310.0900.0340.4000.0000.0000.8920.0680.0000.0000.000
PDM Type_samples0.0280.0281.0001.0000.0280.0280.0710.0660.0600.0000.1090.0250.0670.0000.0000.0001.0000.0000.0000.0000.0000.0000.0001.0000.0000.0000.0000.0000.0000.0000.0000.0000.0250.0250.0001.0000.0270.0750.0000.0000.0670.4040.0000.0001.0001.0000.0001.0000.0000.1790.0670.0000.9980.0750.0000.3990.0250.0000.0000.9921.0000.0250.0670.0000.0000.0000.0000.0000.0140.0000.0000.404
PDX GrowthCurve Avail_specimen_specimens0.1440.1440.0000.0000.0000.0000.2310.9990.6850.8930.0210.0190.8191.0001.0001.0000.9161.0001.0001.0001.0001.0001.0000.0001.0000.8161.0001.0000.1160.0991.0000.0000.8160.8161.0000.9161.0000.9991.0001.0000.9991.0001.0001.0001.0001.0001.0001.0001.0001.0000.9991.0001.0000.9990.0741.0000.0001.0001.0001.0000.0000.0000.9990.4141.0000.0001.0001.0001.0001.0001.0001.000
Passage_pathology0.4680.4681.0001.0000.4680.4680.3870.3320.2430.2740.3430.5890.3540.0000.0000.0001.0000.0000.0000.0000.0000.0000.0001.0000.0000.3350.0000.0000.3380.2640.0000.1670.5890.5890.2101.0000.6560.3800.0000.5170.3540.0510.0001.0000.0001.0001.0000.1930.8290.0000.3541.0000.0000.3800.3740.0530.5890.0000.9920.3141.0000.5890.3540.1590.4720.0000.0000.4730.3200.0000.0000.051
Passage_samples0.4680.4681.0001.0000.4680.4680.3870.3320.2450.2760.3430.5900.3540.0000.0000.0001.0000.0000.0000.0000.0000.0000.0001.0000.0000.3380.0000.0000.3380.2750.0000.1660.5900.5900.2021.0000.6560.3800.0000.0360.3540.2900.0000.0001.0001.0000.1931.0000.3341.0000.3540.0001.0000.3800.3780.2890.5900.0000.3170.9921.0000.5900.3540.1610.2010.0000.0000.1170.2970.0000.0000.290
Pathology Notes_pathology0.9520.9521.0001.0000.9520.9520.9430.9180.9200.9500.6630.9460.9370.0000.0000.0001.0000.9990.9990.9990.9990.9990.9991.0000.0000.9600.0000.0000.9670.9340.0000.9370.9460.9460.8761.0000.9980.9620.0000.8340.9370.1350.0000.5480.0001.0000.8290.3341.0000.3340.9370.5480.0000.9620.9970.0980.9460.0000.9750.4341.0000.9460.9370.6810.7880.0000.0000.7640.8010.0000.0000.135
PathologyAvail_samples0.0000.0001.0001.0000.0000.0000.0000.3710.1920.1931.0000.0000.3711.0001.0001.0001.0000.3550.3550.3550.3550.3550.3551.0001.0000.1551.0001.0000.0000.2181.0000.3720.0000.0000.1731.0001.0000.3711.0000.0000.3710.0331.0000.0000.1791.0000.0001.0000.3341.0000.3710.0000.1790.3710.1150.0340.0001.0000.3510.4031.0000.0000.3710.1920.1400.0001.0000.0640.0981.0001.0000.033
Patient ID0.9990.9990.9990.9990.9990.9990.9991.000-0.6480.0600.7020.9991.0001.0001.0001.0000.9990.9940.9940.9940.9940.9940.9940.9991.0000.9991.0001.0000.9990.9991.0000.9990.9990.9990.8510.9991.0001.0001.0000.3171.0000.1671.0000.0900.0670.9990.3540.3540.9370.3711.0000.0900.0671.0000.9990.1330.9991.0000.9660.9520.9990.9991.0000.687-0.0700.032NaN-0.0140.613NaN1.0000.167
Patient/OriginatingSpecimen_pathology0.0340.0341.0001.0000.0340.0340.0710.0910.0730.0350.1090.0310.0900.0000.0000.0001.0000.0620.0620.0620.0620.0620.0621.0000.0000.0180.0000.0000.0000.0180.0000.0560.0310.0310.0841.0000.0270.0990.0000.8910.0900.0000.0000.9980.0001.0001.0000.0000.5480.0000.0901.0000.0000.0990.0240.0000.0310.0000.9920.0001.0000.0310.0900.0340.4000.0000.0000.8920.0680.0000.0000.000
Patient/OriginatingSpecimen_samples0.0280.0281.0001.0000.0280.0280.0710.0660.0600.0000.1090.0250.0670.0000.0000.0001.0000.0000.0000.0000.0000.0000.0001.0000.0000.0000.0000.0000.0000.0000.0000.0000.0250.0250.0001.0000.0270.0750.0000.0000.0670.4040.0000.0000.9981.0000.0001.0000.0000.1790.0670.0001.0000.0750.0000.3990.0250.0000.0000.9921.0000.0250.0670.0000.0000.0000.0000.0000.0140.0000.0000.404
PatientNotes_df_pinfo_paciente_specimens0.9990.9991.0001.0000.9990.9990.9991.0001.0001.0000.7060.9991.0001.0001.0001.0000.9990.9940.9940.9940.9940.9940.9941.0001.0001.0001.0001.0000.9990.9991.0000.9990.9990.9990.7600.9991.0001.0001.0000.3831.0000.1521.0000.0990.0750.9990.3800.3800.9620.3711.0000.0990.0751.0000.9990.1070.9991.0000.9950.9691.0000.9991.0000.7140.5590.0481.0000.3740.6131.0001.0000.152
ProvidedTissue Origin_specimen_specimens0.5720.5720.0440.0440.5630.5630.5350.9990.8860.8651.0000.5680.9991.0001.0001.0000.0740.9940.9940.9940.9940.9940.9940.0441.0000.9991.0001.0000.6620.5401.0000.6270.5720.5720.8860.0741.0000.9991.0000.3540.9990.1001.0000.0240.0000.0740.3740.3780.9970.1150.9990.0240.0000.9991.0000.0820.5671.0000.9920.9920.0440.5670.9990.7860.6600.0141.0000.4620.4441.0001.0000.100
RNASeqAvail_samples0.1090.1091.0001.0000.1090.1090.0740.1260.1060.0380.0170.1090.1330.0000.0000.0001.0000.0720.0720.0720.0720.0720.0721.0000.0000.0890.0000.0000.0400.1100.0000.0600.1090.1090.0681.0000.0710.1070.0000.0000.1330.9890.0000.0000.3991.0000.0530.2890.0980.0340.1330.0000.3990.1070.0821.0000.1090.0000.0580.9811.0000.1090.1330.0440.0850.0000.0000.0320.0490.0000.0000.989
Race_df_pinfo_paciente_specimens0.9950.9951.0001.0000.7070.7070.4810.9990.7450.6320.7690.9950.9991.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0000.4211.0001.0000.3400.7991.0000.7101.0001.0000.3890.0001.0000.9991.0000.1810.9990.1061.0000.0310.0250.0000.5890.5900.9460.0000.9990.0310.0250.9990.5670.1091.0001.0000.9620.9531.0001.0000.9990.5050.6210.0351.0000.3980.6311.0001.0000.106
Sample ID_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0000.3700.3700.5251.0001.0001.0001.0001.0001.0001.0001.0000.6421.0000.3700.3701.0001.0000.5251.0001.0001.0001.0001.0001.0001.0000.3700.0001.0000.0000.3700.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0001.0000.0000.0001.0001.0001.0000.0000.0000.0000.7630.0001.0000.5760.1320.000
Sample ID_pathology0.9400.9401.0001.0000.9400.9400.9630.9950.9690.9650.5880.9620.9660.0000.0000.0001.0000.9970.9970.9970.9970.9970.9971.0000.0000.9700.0000.0000.9710.9630.0000.9750.9620.9620.9821.0000.9970.9950.0000.9930.9660.1150.0000.9920.0001.0000.9920.3170.9750.3510.9660.9920.0000.9950.9920.0580.9620.0001.0000.2951.0000.9620.9660.6880.9930.0000.0000.9910.9690.0000.0000.115
Sample ID_samples0.9350.9351.0001.0000.9350.9350.9600.9720.9660.9600.5880.9530.9520.0000.0000.0001.0000.9970.9970.9970.9970.9970.9971.0000.0000.9680.0000.0000.9620.9500.0000.9420.9530.9530.8171.0000.9970.9690.0000.2930.9520.9810.0000.0000.9921.0000.3140.9920.4340.4030.9520.0000.9920.9690.9920.9810.9530.0000.2951.0001.0000.9530.9520.6840.3840.0000.0000.3010.5650.0000.0000.981
Self-ReportedEthnicity_df_race_paciente_specimens1.0001.0000.8330.8330.0000.0000.1621.0000.7730.7730.9920.0850.9991.0001.0001.0000.0001.0001.0001.0001.0001.0001.0000.8331.0000.0681.0001.0000.0721.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0000.9991.0001.0001.0000.0441.0001.0001.0001.0001.0001.0001.0000.9990.5781.0000.0001.0001.0001.0001.0001.0001.000
Self-ReportedRace_df_race_paciente_specimens0.9950.9951.0001.0000.7070.7070.4810.9990.7450.6320.7690.9950.9991.0001.0001.0000.0001.0001.0001.0001.0001.0001.0001.0001.0000.4211.0001.0000.3400.7991.0000.7101.0001.0000.3890.0001.0000.9991.0000.1810.9990.1061.0000.0310.0250.0000.5890.5900.9460.0000.9990.0310.0250.9990.5670.1091.0001.0000.9620.9531.0001.0000.9990.5050.6210.0351.0000.3980.6311.0001.0000.106
Specimen ID0.9990.9990.9990.9990.9990.9990.9991.0000.9990.9990.7020.9991.0001.0001.0001.0000.9990.9940.9940.9940.9940.9940.9940.9991.0000.9991.0001.0000.9990.9991.0000.9990.9990.9990.8510.9991.0001.0001.0000.3171.0000.1671.0000.0900.0670.9990.3540.3540.9370.3711.0000.0900.0671.0000.9990.1330.9991.0000.9660.9520.9990.9991.0000.6870.4540.0321.0000.3940.6131.0001.0000.167
StandardizedRegimen_trathistory_paciente_specimens0.4660.4660.5780.5780.3940.3940.3670.7060.6610.6030.9950.4300.6860.0000.0000.0000.4141.0001.0001.0001.0001.0001.0000.5780.0000.5900.0000.0000.6460.5390.0000.6640.5040.5040.6570.4140.6620.7170.0000.2240.7130.0720.0000.0340.0000.4140.1590.1610.6810.1920.6870.0340.0000.7140.7860.0440.5050.0000.6880.6840.5780.5050.6871.0000.3160.9180.0000.3400.1680.0000.0000.072
Stromal_pathology0.5490.5491.0001.0000.5490.5490.4490.5580.201-0.0540.4800.6210.4540.0000.0000.0001.0000.8160.8160.8160.8160.8160.8161.0000.0000.4900.0000.0000.5440.5300.0000.3840.6210.6210.4911.0000.9100.5590.0000.2710.4540.1060.0000.4000.0001.0000.4720.2010.7880.140-0.0700.4000.0000.5590.6600.0850.6210.0000.9930.3841.0000.6210.4540.3161.0000.0000.000-0.6690.3270.0000.0000.106
Timing_trathistory_paciente_specimens0.0550.0550.0000.0000.0590.0590.0590.0000.0000.0190.9920.0380.0370.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0260.0000.0000.0310.0000.0000.0000.0290.0290.0000.0000.0650.0560.0000.0000.0360.0000.0000.0000.0000.0000.0000.0000.0000.0000.0320.0000.0000.0480.0140.0000.0350.0000.0000.0000.0000.0350.0320.9180.0001.0000.0000.0000.0530.0000.0000.000
Total Reads_gene1.0001.0001.0001.0001.0001.0001.0001.000NaNNaN0.0001.0001.0000.7260.7261.0001.0001.0001.0001.0001.0001.0001.0001.0000.9991.0000.7260.7261.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.7260.0001.0000.0000.7260.0000.0001.0000.0000.0000.0001.000NaN0.0000.0001.0001.0000.0001.0000.7630.0000.0001.0001.0001.0000.0000.0000.0001.0000.0001.0000.3270.3490.000
Tumor Content_pathology0.3780.3781.0001.0000.3780.3780.3630.462-0.1030.0950.4200.3980.3940.0000.0000.0001.0000.2770.2770.2770.2770.2770.2771.0000.0000.4280.0000.0000.3470.4230.0000.3570.3980.3980.4331.0000.1750.3740.0000.7750.3940.0400.0000.8920.0001.0000.4730.1170.7640.064-0.0140.8920.0000.3740.4620.0320.3980.0000.9910.3011.0000.3980.3940.340-0.6690.0000.0001.0000.1520.0000.0000.040
Tumor Grade_pathology0.6790.6791.0001.0000.6790.6790.7240.3760.2220.5590.6620.6310.6131.0001.0001.0001.0000.3860.3860.3860.3860.3860.3861.0001.0000.5561.0001.0000.5430.1681.0000.2000.6310.6310.1851.0001.0000.6131.0000.2310.6130.0511.0000.0680.0141.0000.3200.2970.8010.0980.6130.0680.0140.6130.4440.0490.6311.0000.9690.5651.0000.6310.6130.1680.3270.0531.0000.1521.0001.0001.0000.051
Variant AlleleFrequency_gene1.0001.0001.0001.0001.0001.0001.0001.000NaNNaN0.0001.0001.0000.9990.9990.9991.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.9990.9991.0001.0000.9991.0001.0001.0001.0001.0001.0001.0000.9990.0001.0000.0000.9990.0000.0001.0000.0000.0000.0001.000NaN0.0000.0001.0001.0000.0001.0000.5760.0000.0001.0001.0001.0000.0000.0000.0000.3270.0001.0001.0000.9990.000
VariantClass_gene1.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.0001.0001.0001.0001.0000.2651.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0001.0000.2651.0001.0001.0001.0001.0001.0001.0001.0000.0001.0000.0001.0000.0000.0001.0000.0000.0000.0001.0001.0000.0000.0001.0001.0000.0001.0000.1320.0000.0001.0001.0001.0000.0000.0000.0000.3490.0001.0000.9991.0000.000
Whole ExomeSequence Avail_samples0.1070.1071.0001.0000.1070.1070.0740.1660.1200.0680.0170.1060.1670.0000.0000.0001.0000.1330.1330.1330.1330.1330.1331.0000.0000.1120.0000.0000.0360.1430.0000.1160.1060.1060.0951.0000.0710.1520.0000.0000.1670.9990.0000.0000.4041.0000.0510.2900.1350.0330.1670.0000.4040.1520.1000.9890.1060.0000.1150.9811.0000.1060.1670.0720.1060.0000.0000.0400.0510.0000.0001.000

Missing values

A simple visualization of nullity by column.
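The per-column nullity counts behind this chart can be reproduced with plain pandas. The sketch below is a minimal illustration, not the profiler's own implementation; the function name `missing_summary` and the toy frame are hypothetical. The same arithmetic matches the overview figures: 17571 missing cells out of 75 × 2733 = 204975 cells is about 8.6%.

```python
import numpy as np
import pandas as pd


def missing_summary(df: pd.DataFrame):
    """Return per-column missing counts and the overall share of missing cells (%)."""
    per_column = df.isna().sum()
    total_cells = df.shape[0] * df.shape[1]
    pct_missing = 100.0 * float(per_column.sum()) / total_cells if total_cells else 0.0
    return per_column, pct_missing


# Hypothetical toy frame, not taken from the profiled dataset.
toy = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": ["a", "b", None]})
per_column, pct_missing = missing_summary(toy)
print(per_column)
print(round(pct_missing, 1))
```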
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
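One common way to compute such a nullity correlation is the Pearson correlation between the missingness indicators of each pair of columns. The sketch below assumes that definition; the function name `nullity_correlation` and the toy data are hypothetical and are not drawn from the profiled dataset.

```python
import numpy as np
import pandas as pd


def nullity_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation between the missingness indicators of each column."""
    is_missing = df.isna()
    # Columns that are always present or always missing have zero variance,
    # so their correlation is undefined; drop them before correlating.
    varying = is_missing.loc[:, is_missing.nunique() > 1]
    return varying.astype(float).corr()


# Hypothetical toy frame for illustration only.
toy = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": [np.nan, 2.0, np.nan, 4.0],  # missing exactly when "a" is present
    "c": [1.0, np.nan, 3.0, np.nan],  # missing exactly when "a" is missing
    "d": [1.0, 2.0, 3.0, 4.0],        # never missing, so it is dropped
})
result = nullity_correlation(toy)
print(result)
```

A value near 1 means two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness of one says little about the other.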

Sample

First rows

Patient IDDiagnosisSubtype_trathistory_paciente_specimensTiming_trathistory_paciente_specimensDate RegimenStarted_trathistory_paciente_specimensStandardizedRegimen_trathistory_paciente_specimensBest Response_trathistory_paciente_specimensDiagnosisSubtype_df_race_paciente_specimensSelf-ReportedRace_df_race_paciente_specimensSelf-ReportedEthnicity_df_race_paciente_specimens%European(CEU)_df_race_paciente_specimens%Native and LatinAmerican (NA)_df_race_paciente_specimens%West African(YRI)_df_race_paciente_specimensInferredAncestry_df_race_paciente_specimensBiologicalSex_df_pinfo_paciente_specimensDiagnosisSubtype_df_pinfo_paciente_specimensAge atDiagnosis_df_pinfo_paciente_specimens%European(CEU)_df_pinfo_paciente_specimens%Native and LatinAmerican (NA)_df_pinfo_paciente_specimens%West African(YRI)_df_pinfo_paciente_specimensAdditionalMedicalHistory_df_pinfo_paciente_specimensDate ofDiagnosis_df_pinfo_paciente_specimensEthnicity_df_pinfo_paciente_specimensGrade StageInformation_df_pinfo_paciente_specimensHas KnownMetastaticDisease_df_pinfo_paciente_specimensHas Smoked100 Cigarettes_df_pinfo_paciente_specimensInferredAncestry_df_pinfo_paciente_specimensMolecular andIHC Data_df_pinfo_paciente_specimensOccupation_df_pinfo_paciente_specimensPatientNotes_df_pinfo_paciente_specimensRace_df_pinfo_paciente_specimensSpecimen IDBiopsySite_specimen_specimensDiagnosisSubtype_specimen_specimensPDX GrowthCurve Avail_specimen_specimensConsensus Whole ExomeSequence Avail_specimen_specimensMSI Status_specimen_specimensCollectionDate_specimen_specimensHuman PathogenTesting Summary_specimen_specimensAble to ViablyPassage in nude mice_specimen_specimensAge atSampling_specimen_specimensModelNotes_specimen_specimensProvidedTissue Origin_specimen_specimensSample ID_samplesDiagnosisSubtype_samplesPatient/OriginatingSpecimen_samplesPassage_samplesPathologyAvail_samplesOncoKB Cancer GenePanel Data Avail_samplesWhole ExomeSequence Avail_samplesRNASeqAvail_samplesPDM Type_samplesSample 
ID_geneHugoSymbol_geneChr_geneChr End_geneChr Start_geneHGVS ProteinChange_geneHGVS cDNAChange_geneTotal Reads_geneVariant AlleleFrequency_geneVariantClass_geneMutationEffect_geneOncogenicity_geneExisting Variant_geneDiagnosisSubtype_pathologySample ID_pathologyPatient/OriginatingSpecimen_pathologyPassage_pathologyPDM Type_pathologyTumor Grade_pathologyTumor Content_pathologyNecrosis_pathologyStromal_pathologyInflammatory Cell_pathologyPathology Notes_pathology
0111316epithelioid typeCurrent08/2021RipretinibNaNepithelioid typeWhiteNot Hispanic or Latino10000European (CEU)Maleepithelioid type5810000Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.07/2020Not Hispanic or LatinoGrade, TNM (Clinical)YesNoEuropean (CEU)IHC: DOG1+, CD117+; S100-, CK AE1/AE3-EngineerTumor Grade/Stage: cM0 (at diagnosis); high grade \nLocation of known metastasis: liver, jejunum, ileumWhite319-RLiver [central]epithelioid typeYesYesMSI-Stable11/2021NegativeYes59PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)Metastatic SiteAH5epithelioid typeNo0.0YesNoNoNoPDXAH5Q86T81KITchr455593654.055593601.0p.Q556_T574delinsPc.1667_1720del487.00.2813In_Frame_DelGain-of-functionOncogenicNaNepithelioid typeAH5No0.0PDX<Unknown>95.00.05.0Absent/No InflammationGastrointestinal stromal tumor (GIST), epithelioid.\nThe tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done.\n
1111316epithelioid typeCurrent08/2021RipretinibNaNepithelioid typeWhiteNot Hispanic or Latino10000European (CEU)Maleepithelioid type5810000Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.07/2020Not Hispanic or LatinoGrade, TNM (Clinical)YesNoEuropean (CEU)IHC: DOG1+, CD117+; S100-, CK AE1/AE3-EngineerTumor Grade/Stage: cM0 (at diagnosis); high grade \nLocation of known metastasis: liver, jejunum, ileumWhite319-RLiver [central]epithelioid typeYesYesMSI-Stable11/2021NegativeYes59PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)Metastatic SiteAH5epithelioid typeNo0.0YesNoNoNoPDXAH5Q86T81KITchr455593654.055593601.0p.Q556_T574delinsPc.1667_1720del487.00.2813In_Frame_DelGain-of-functionOncogenicNaNepithelioid typeAH5Q86T81No2.0PDX<Unknown>85.010.05.0Absent/No InflammationGastrointestinal stromal tumor (GIST), epithelioid.\nThe tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. A previous passage of the tumor was positive for CD34. \nConventional grading of GIST is not done.
2111316epithelioid typeCurrent08/2021RipretinibNaNepithelioid typeWhiteNot Hispanic or Latino10000European (CEU)Maleepithelioid type5810000Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.07/2020Not Hispanic or LatinoGrade, TNM (Clinical)YesNoEuropean (CEU)IHC: DOG1+, CD117+; S100-, CK AE1/AE3-EngineerTumor Grade/Stage: cM0 (at diagnosis); high grade \nLocation of known metastasis: liver, jejunum, ileumWhite319-RLiver [central]epithelioid typeYesYesMSI-Stable11/2021NegativeYes59PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)Metastatic SiteAH5epithelioid typeNo0.0YesNoNoNoPDXAH5Q86T81KITchr455593654.055593601.0p.Q556_T574delinsPc.1667_1720del487.00.2813In_Frame_DelGain-of-functionOncogenicNaNepithelioid typeAH5Q89No1.0PDX<Unknown>85.00.015.0Absent/No InflammationGastrointestinal stromal tumor (GIST), epithelioid.\nThe tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The previous passage of the tumor was positive for CD34. \nConventional grading of GIST is not done.
3111316epithelioid typeCurrent08/2021RipretinibNaNepithelioid typeWhiteNot Hispanic or Latino10000European (CEU)Maleepithelioid type5810000Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.07/2020Not Hispanic or LatinoGrade, TNM (Clinical)YesNoEuropean (CEU)IHC: DOG1+, CD117+; S100-, CK AE1/AE3-EngineerTumor Grade/Stage: cM0 (at diagnosis); high grade \nLocation of known metastasis: liver, jejunum, ileumWhite319-RLiver [central]epithelioid typeYesYesMSI-Stable11/2021NegativeYes59PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)Metastatic SiteAH5epithelioid typeNo0.0YesNoNoNoPDXAH5Q86T81KITchr455593654.055593601.0p.Q556_T574delinsPc.1667_1720del487.00.2813In_Frame_DelGain-of-functionOncogenicNaNepithelioid typeAH6T26No1.0PDX<Unknown>90.00.010.0Absent/No InflammationGastrointestinal stromal tumor (GIST), epithelioid.\nThe tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done.
4111316epithelioid typeCurrent08/2021RipretinibNaNepithelioid typeWhiteNot Hispanic or Latino10000European (CEU)Maleepithelioid type5810000Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.07/2020Not Hispanic or LatinoGrade, TNM (Clinical)YesNoEuropean (CEU)IHC: DOG1+, CD117+; S100-, CK AE1/AE3-EngineerTumor Grade/Stage: cM0 (at diagnosis); high grade \nLocation of known metastasis: liver, jejunum, ileumWhite319-RLiver [central]epithelioid typeYesYesMSI-Stable11/2021NegativeYes59PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)Metastatic SiteAH5epithelioid typeNo0.0YesNoNoNoPDXAH5Q86T81KITchr455593654.055593601.0p.Q556_T574delinsPc.1667_1720del487.00.2813In_Frame_DelGain-of-functionOncogenicNaNepithelioid typeAH6T28No1.0PDX<Unknown>90.00.010.0Absent/No InflammationGastrointestinal stromal tumor (GIST), epithelioid.\nThe tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done.
5111316epithelioid typeCurrent08/2021RipretinibNaNepithelioid typeWhiteNot Hispanic or Latino10000European (CEU)Maleepithelioid type5810000Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.07/2020Not Hispanic or LatinoGrade, TNM (Clinical)YesNoEuropean (CEU)IHC: DOG1+, CD117+; S100-, CK AE1/AE3-EngineerTumor Grade/Stage: cM0 (at diagnosis); high grade \nLocation of known metastasis: liver, jejunum, ileumWhite319-RLiver [central]epithelioid typeYesYesMSI-Stable11/2021NegativeYes59PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)Metastatic SiteAH5epithelioid typeNo0.0YesNoNoNoPDXAH5Q86T81KITchr455593654.055593601.0p.Q556_T574delinsPc.1667_1720del487.00.2813In_Frame_DelGain-of-functionOncogenicNaNepithelioid typeAH8No0.0PDX<Unknown>95.00.05.0Absent/No InflammationGastrointestinal stromal tumor (GIST), epithelioid.\nThe tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34 in another passage. Conventional grading of GIST is not done.
6111316epithelioid typeCurrent08/2021RipretinibNaNepithelioid typeWhiteNot Hispanic or Latino10000European (CEU)Maleepithelioid type5810000Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.07/2020Not Hispanic or LatinoGrade, TNM (Clinical)YesNoEuropean (CEU)IHC: DOG1+, CD117+; S100-, CK AE1/AE3-EngineerTumor Grade/Stage: cM0 (at diagnosis); high grade \nLocation of known metastasis: liver, jejunum, ileumWhite319-RLiver [central]epithelioid typeYesYesMSI-Stable11/2021NegativeYes59PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)Metastatic SiteAH5epithelioid typeNo0.0YesNoNoNoPDXAH5Q86T81KITchr455593654.055593601.0p.Q556_T574delinsPc.1667_1720del487.00.2813In_Frame_DelGain-of-functionOncogenicNaNepithelioid typeAH8Q94No1.0PDX<Unknown>95.00.05.0Absent/No InflammationGastrointestinal stromal tumor (GIST), epithelioid.\nThe tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. Another passage of the tumor was positive for CD34. \nConventional grading of GIST is not done.
7111316epithelioid typeCurrent08/2021RipretinibNaNepithelioid typeWhiteNot Hispanic or Latino10000European (CEU)Maleepithelioid type5810000Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.07/2020Not Hispanic or LatinoGrade, TNM (Clinical)YesNoEuropean (CEU)IHC: DOG1+, CD117+; S100-, CK AE1/AE3-EngineerTumor Grade/Stage: cM0 (at diagnosis); high grade \nLocation of known metastasis: liver, jejunum, ileumWhite319-RLiver [central]epithelioid typeYesYesMSI-Stable11/2021NegativeYes59PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)Metastatic SiteAH5epithelioid typeNo0.0YesNoNoNoPDXAH5Q86T81KITchr455593654.055593601.0p.Q556_T574delinsPc.1667_1720del487.00.2813In_Frame_DelGain-of-functionOncogenicNaNepithelioid typeAH9No0.0PDX<Unknown>95.00.05.0Absent/No InflammationGastrointestinal stromal tumor (GIST), epithelioid.\nThe tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34 in another passage. Conventional grading of GIST is not done.
8111316epithelioid typeCurrent08/2021RipretinibNaNepithelioid typeWhiteNot Hispanic or Latino10000European (CEU)Maleepithelioid type5810000Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.07/2020Not Hispanic or LatinoGrade, TNM (Clinical)YesNoEuropean (CEU)IHC: DOG1+, CD117+; S100-, CK AE1/AE3-EngineerTumor Grade/Stage: cM0 (at diagnosis); high grade \nLocation of known metastasis: liver, jejunum, ileumWhite319-RLiver [central]epithelioid typeYesYesMSI-Stable11/2021NegativeYes59PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)Metastatic SiteAH5epithelioid typeNo0.0YesNoNoNoPDXAH5Q86T81KITchr455593654.055593601.0p.Q556_T574delinsPc.1667_1720del487.00.2813In_Frame_DelGain-of-functionOncogenicNaNepithelioid typeOriginatorYesNaNPatient/Originator Specimen<Unknown>40.050.010.0Absent/No InflammationGastrointestinal stromal tumor (GIST), epithelioid.\nThe tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done.
9111316epithelioid typeCurrent08/2021RipretinibNaNepithelioid typeWhiteNot Hispanic or Latino10000European (CEU)Maleepithelioid type5810000Final Pathology: Dx confirmed; mitotic rate is 47/30 HPF; treatment effect present w/central necrosis.07/2020Not Hispanic or LatinoGrade, TNM (Clinical)YesNoEuropean (CEU)IHC: DOG1+, CD117+; S100-, CK AE1/AE3-EngineerTumor Grade/Stage: cM0 (at diagnosis); high grade \nLocation of known metastasis: liver, jejunum, ileumWhite319-RLiver [central]epithelioid typeYesYesMSI-Stable11/2021NegativeYes59PDX IHC/Path: CD34+ (diffusely), SMA+ (rare cells)Metastatic SiteAH5epithelioid typeNo0.0YesNoNoNoPDXAH5Q86T81KITchr455599332.055599332.0p.D820Yc.2458G>T501.00.5210Missense_MutationInconclusiveResistanceCM023092,KinMutBase_KIT_DNA:g.76151G>T,COSM12710,COSM17947,COSM22379epithelioid typeAH5No0.0PDX<Unknown>95.00.05.0Absent/No InflammationGastrointestinal stromal tumor (GIST), epithelioid.\nThe tumor is composed of nests of rounded epithelioid cells with clear to eosinophilic cytoplasm showing mild to moderate pleomorphism with no spindle type cells. Rare mitosis observed. The tumor is diffusely positive for CD34, and rare cells are positive for SMA. Conventional grading of GIST is not done.\n

Last rows

Patient IDDiagnosisSubtype_trathistory_paciente_specimensTiming_trathistory_paciente_specimensDate RegimenStarted_trathistory_paciente_specimensStandardizedRegimen_trathistory_paciente_specimensBest Response_trathistory_paciente_specimensDiagnosisSubtype_df_race_paciente_specimensSelf-ReportedRace_df_race_paciente_specimensSelf-ReportedEthnicity_df_race_paciente_specimens%European(CEU)_df_race_paciente_specimens%Native and LatinAmerican (NA)_df_race_paciente_specimens%West African(YRI)_df_race_paciente_specimensInferredAncestry_df_race_paciente_specimensBiologicalSex_df_pinfo_paciente_specimensDiagnosisSubtype_df_pinfo_paciente_specimensAge atDiagnosis_df_pinfo_paciente_specimens%European(CEU)_df_pinfo_paciente_specimens%Native and LatinAmerican (NA)_df_pinfo_paciente_specimens%West African(YRI)_df_pinfo_paciente_specimensAdditionalMedicalHistory_df_pinfo_paciente_specimensDate ofDiagnosis_df_pinfo_paciente_specimensEthnicity_df_pinfo_paciente_specimensGrade StageInformation_df_pinfo_paciente_specimensHas KnownMetastaticDisease_df_pinfo_paciente_specimensHas Smoked100 Cigarettes_df_pinfo_paciente_specimensInferredAncestry_df_pinfo_paciente_specimensMolecular andIHC Data_df_pinfo_paciente_specimensOccupation_df_pinfo_paciente_specimensPatientNotes_df_pinfo_paciente_specimensRace_df_pinfo_paciente_specimensSpecimen IDBiopsySite_specimen_specimensDiagnosisSubtype_specimen_specimensPDX GrowthCurve Avail_specimen_specimensConsensus Whole ExomeSequence Avail_specimen_specimensMSI Status_specimen_specimensCollectionDate_specimen_specimensHuman PathogenTesting Summary_specimen_specimensAble to ViablyPassage in nude mice_specimen_specimensAge atSampling_specimen_specimensModelNotes_specimen_specimensProvidedTissue Origin_specimen_specimensSample ID_samplesDiagnosisSubtype_samplesPatient/OriginatingSpecimen_samplesPassage_samplesPathologyAvail_samplesOncoKB Cancer GenePanel Data Avail_samplesWhole ExomeSequence Avail_samplesRNASeqAvail_samplesPDM Type_samplesSample 
ID_geneHugoSymbol_geneChr_geneChr End_geneChr Start_geneHGVS ProteinChange_geneHGVS cDNAChange_geneTotal Reads_geneVariant AlleleFrequency_geneVariantClass_geneMutationEffect_geneOncogenicity_geneExisting Variant_geneDiagnosisSubtype_pathologySample ID_pathologyPatient/OriginatingSpecimen_pathologyPassage_pathologyPDM Type_pathologyTumor Grade_pathologyTumor Content_pathologyNecrosis_pathologyStromal_pathologyInflammatory Cell_pathologyPathology Notes_pathology
2723949853spindle typePriorNaNTreatment naiveNaNspindle typeWhiteNot Hispanic or Latino10000European (CEU)Malespindle type3910000Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma.10/2015Not Hispanic or LatinoNone ProvidedYesYesEuropean (CEU)IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST).Plastics factory workerLocation of know metastases: liverWhite013-RStomachspindle typeYesYesMSI-Stable01/2016Negative\nYes39NaNPrimaryCVYVL3A19spindle typeNo2.0YesYesYesYesPDXNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNspindle typeCVYVK6No1.0PDX<Unknown>90.00.00.0<Unknown>Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors.
2724949853spindle typePriorNaNTreatment naiveNaNspindle typeWhiteNot Hispanic or Latino10000European (CEU)Malespindle type3910000Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma.10/2015Not Hispanic or LatinoNone ProvidedYesYesEuropean (CEU)IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST).Plastics factory workerLocation of know metastases: liverWhite013-RStomachspindle typeYesYesMSI-Stable01/2016Negative\nYes39NaNPrimaryCVYVL3A19spindle typeNo2.0YesYesYesYesPDXNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNspindle typeCVYVK8No1.0PDX<Unknown>90.00.00.0<Unknown>Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors.
2725949853spindle typePriorNaNTreatment naiveNaNspindle typeWhiteNot Hispanic or Latino10000European (CEU)Malespindle type3910000Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma.10/2015Not Hispanic or LatinoNone ProvidedYesYesEuropean (CEU)IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST).Plastics factory workerLocation of know metastases: liverWhite013-RStomachspindle typeYesYesMSI-Stable01/2016Negative\nYes39NaNPrimaryCVYVL3A19spindle typeNo2.0YesYesYesYesPDXNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNspindle typeCVYVL0No1.0PDXIntermediate grade or moderately differentiated85.00.015.0<Unknown>Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors.
2726949853spindle typePriorNaNTreatment naiveNaNspindle typeWhiteNot Hispanic or Latino10000European (CEU)Malespindle type3910000Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma.10/2015Not Hispanic or LatinoNone ProvidedYesYesEuropean (CEU)IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST).Plastics factory workerLocation of know metastases: liverWhite013-RStomachspindle typeYesYesMSI-Stable01/2016Negative\nYes39NaNPrimaryCVYVL3A19spindle typeNo2.0YesYesYesYesPDXNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNspindle typeCVYVL3A19No2.0PDX<Unknown>90.00.00.0<Unknown>Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors.
2727949853spindle typePriorNaNTreatment naiveNaNspindle typeWhiteNot Hispanic or Latino10000European (CEU)Malespindle type3910000Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma.10/2015Not Hispanic or LatinoNone ProvidedYesYesEuropean (CEU)IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST).Plastics factory workerLocation of know metastases: liverWhite013-RStomachspindle typeYesYesMSI-Stable01/2016Negative\nYes39NaNPrimaryORIGINATORspindle typeYesNaNNoYesYesYesPatient/Originator SpecimenNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNspindle typeCVVNo0.0PDX<Unknown>90.00.00.0<Unknown>Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors.
2728949853spindle typePriorNaNTreatment naiveNaNspindle typeWhiteNot Hispanic or Latino10000European (CEU)Malespindle type3910000Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma.10/2015Not Hispanic or LatinoNone ProvidedYesYesEuropean (CEU)IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST).Plastics factory workerLocation of know metastases: liverWhite013-RStomachspindle typeYesYesMSI-Stable01/2016Negative\nYes39NaNPrimaryORIGINATORspindle typeYesNaNNoYesYesYesPatient/Originator SpecimenNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNspindle typeCVYNo0.0PDX<Unknown>90.00.010.0<Unknown>Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma may exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors.
2729949853spindle typePriorNaNTreatment naiveNaNspindle typeWhiteNot Hispanic or Latino10000European (CEU)Malespindle type3910000Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma.10/2015Not Hispanic or LatinoNone ProvidedYesYesEuropean (CEU)IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST).Plastics factory workerLocation of know metastases: liverWhite013-RStomachspindle typeYesYesMSI-Stable01/2016Negative\nYes39NaNPrimaryORIGINATORspindle typeYesNaNNoYesYesYesPatient/Originator SpecimenNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNspindle typeCVYVK6No1.0PDX<Unknown>90.00.00.0<Unknown>Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma may exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors.
2730949853spindle typePriorNaNTreatment naiveNaNspindle typeWhiteNot Hispanic or Latino10000European (CEU)Malespindle type3910000Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma.10/2015Not Hispanic or LatinoNone ProvidedYesYesEuropean (CEU)IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST).Plastics factory workerLocation of know metastases: liverWhite013-RStomachspindle typeYesYesMSI-Stable01/2016Negative\nYes39NaNPrimaryORIGINATORspindle typeYesNaNNoYesYesYesPatient/Originator SpecimenNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNspindle typeCVYVK8No1.0PDX<Unknown>90.00.00.0<Unknown>Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma may exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors.
2731949853spindle typePriorNaNTreatment naiveNaNspindle typeWhiteNot Hispanic or Latino10000European (CEU)Malespindle type3910000Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma.10/2015Not Hispanic or LatinoNone ProvidedYesYesEuropean (CEU)IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST).Plastics factory workerLocation of know metastases: liverWhite013-RStomachspindle typeYesYesMSI-Stable01/2016Negative\nYes39NaNPrimaryORIGINATORspindle typeYesNaNNoYesYesYesPatient/Originator SpecimenNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNspindle typeCVYVL0No1.0PDXIntermediate grade or moderately differentiated85.00.015.0<Unknown>Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma may exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors.
2732949853spindle typePriorNaNTreatment naiveNaNspindle typeWhiteNot Hispanic or Latino10000European (CEU)Malespindle type3910000Final Pathology: Stomach - Dx confirmed, predominantly spindle-cell type. Mitotic count: 80 per 5mm2. Liver: Metastatic gastrointestinal stromal sarcoma.10/2015Not Hispanic or LatinoNone ProvidedYesYesEuropean (CEU)IHC: DOG1+, CD117+, and PDGFR+ consistent with a gastrointestinal stromal tumor (GIST).Plastics factory workerLocation of know metastases: liverWhite013-RStomachspindle typeYesYesMSI-Stable01/2016Negative\nYes39NaNPrimaryORIGINATORspindle typeYesNaNNoYesYesYesPatient/Originator SpecimenNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNspindle typeCVYVL3A19No2.0PDX<Unknown>90.00.00.0<Unknown>Spindle cell Gastrointestinal Stromal Tumors (GIST) account for nearly 70% of cases and are composed of cells arranged in short fascicles and whorls. The stroma may exhibit areas of myxoid change. The individual cells reveal ill-defined cell borders with ovoid nuclei, fine nuclear chromatin, and inconspicuous nucleoli. The cytoplasm has a pale, eosinophilic, and fibrillary quality. Many gastric spindle cell GISTs show extensive paranuclear vacuolization. The degree of paranuclear vacuolization, however, is much more pronounced in GISTs than it is in smooth muscle tumors.